EXECUTIVE SUMMARY

Selected  components  of  the  MISA  program  relating  to  sampling  frequency 
requirements  were  investigated,  specifically: 

•  estimation of monthly means for use in the development of BATEA effluent limits, and

•  data characterization for the determination of presence/absence of a compound.

Basic  statistical  procedures  and  approaches  used  by  various  researchers  were  also 
employed  here  to  investigate  the  above  aspects  of  the  MISA  program. 

The  first  study  component  involved  investigation  of  the  effect  of  four  relative  levels  of 
variability  of  an  industrial  effluent  constituent  on  the  resulting  accuracy  of  estimates  of 
the  mean.  A  package  of  computer  programs  was  developed  to  assist  in  the  analysis. 
The  problem  of  a  lack  of  actual  data  for  use  in  our  analysis  was  overcome  by  using  a 
series  of  artificially  generated  data  sets  to  represent  different  levels  of  variability. 

The  second  study  component  involved  the  determination  of  the  minimum  sampling 
frequency  capable  of  detecting  various  levels  of  pollutant  occurrence.  Basic  statistical 
procedures  and  computer  programs  were  used  to  generate  artificial  data  and  repeatedly 
sample  the  data  bases  to  estimate  sampling  efficiencies. 

In  the  case  of  industrial  effluent  constituents  with  high  and  very  high  variability  (i.e., 
coefficient  of  variation  (CV)  >60%),  the  study  found  that  greater  than  thrice  weekly 
sampling  was  required  to  meet  the  assumed  accuracy  goal  of  (±25%)  for  estimation  of 
monthly  means  to  be  used  for  development  of  BATEA  effluent  limits.  At  least  thrice 
weekly  sampling  was  required  for  medium  variability  (30%  <  CV  <  60%)  and  at  least 
weekly  for  low  variability  (CV  <30%). 


The minimum sampling frequencies required to identify the presence of constituents (at
least 80%* of the time) for various values of φ (probability of a constituent being
above the detection limit on any given day) were identified as follows:

Frequency        Suitable for φ

Monthly          > .15
Bi-monthly       > .25
Quarterly        > .35
Semi-annual      > .50 (*at 75%) or > .40 (*at 64%)
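
The thresholds in the table above can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions that this summary does not state: daily occurrences are independent, a constituent is detected whenever it is above the detection limit on a sampled day, and the program runs for one year (so monthly sampling gives 12 samples, bi-monthly 6, quarterly 4 and semi-annual 2). Under those assumptions, the probability of at least one detection in n samples is 1 - (1 - p)^n, where p is the daily probability of being above the detection limit. The function and variable names are hypothetical.

```python
def detection_probability(p_daily: float, n_samples: int) -> float:
    """Probability of at least one detection in n independent samples."""
    return 1.0 - (1.0 - p_daily) ** n_samples


# Samples per year for each frequency (assumes a one-year program).
SAMPLES_PER_YEAR = {"monthly": 12, "bi-monthly": 6, "quarterly": 4, "semi-annual": 2}

# Threshold daily probabilities taken from the table above.
THRESHOLDS = {"monthly": 0.15, "bi-monthly": 0.25, "quarterly": 0.35, "semi-annual": 0.50}

for freq, n in SAMPLES_PER_YEAR.items():
    p = THRESHOLDS[freq]
    print(f"{freq:12s} n={n:2d}  p={p:.2f}  P(detect)={detection_probability(p, n):.2f}")
```

Under these assumptions the four thresholds give detection probabilities of roughly 0.86, 0.82, 0.82 and 0.75, which agrees with the 80% target and with the 75% footnote for semi-annual sampling.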

The  methods  presented  in  this  study  can  be  used  to  assess  actual  industrial  monitoring 
data  as  they  become  available.  The  MISA  objectives  should  be  reviewed  in  terms  of 
the  statistical  requirements  to  achieve  stated  goals.  Specifically,  the  accuracy  and 
precision  requirements  for  BATEA  data  sets  must  be  determined  (or  values  assumed). 
Also, the minimum φ that the characterization program is to detect should be specified
along  with  the  minimum  probability  for  detection  that  is  acceptable.  As  these  values 
are  refined,  the  calculations  in  this  report  should  be  updated. 
1.0  INTRODUCTION

1.1  BACKGROUND 

The  MISA  Advisory  Committee  (MAC)  is  a  group  of  independent  technical  and 
environmental  experts.  MAC  was  established  in  November,  1986  to  review  draft 
regulations  and  to  provide  advice  and  recommendations  to  the  Minister  of  the 
Environment  concerning  the  Municipal/Industrial  Strategy  for  Abatement  (MISA) 
program. 

MAC  is  concerned  about  the  statistical  validity  of  certain  sampling  frequencies 
proposed  in  the  draft  regulations.  The  Committee  wishes  to  determine  monitoring 
criteria  necessary  to  develop  a  data  base  which  will  provide  for  the  achievement  of 
MISA  program  objectives. 

Gartner  Lee  Limited  was  retained  to  conduct  a  statistical  assessment  of  selected 
components  of  the  MISA  monitoring  approach. 

1.2  OBJECTIVES 

The  three  objectives  for  this  study  were: 

1.  to  specify  statistically  justifiable  sampling  frequencies  and  protocols  required  to 
provide  valid  data  (concentrations  and  loadings  of  conventional  and  organic 
parameters)  on  which  to  base  BATEA*, 

2.  to  specify  the  data  base  characteristics  (i.e.  data  quantity,  quality  and  frequency) 
necessary  to  determine,  within  reasonable  confidence  limits,  the  presence  or 
absence  of  all  Effluent  Monitoring  Priority  Pollutant  List  (EMPPL)  compounds 
in  effluents,  and 

3.  to  specify  the  data  base  characteristics  necessary  to  identify  compounds  in 
effluents  which  are  not  on  the  EMPPL. 


*Terms  in  italics  are  defined  in  the  glossary  contained  at  the  end  of  this  report 


1.3  APPROACH 

The  questions  raised  by  the  MISA  Advisory  Committee  deal  with  design  of  water 
quality  monitoring  programs,  specifically  sampling  frequency.  This  topic  has  received 
considerable  attention  in  recent  years  (e.g.,  Ward,  et  al,  1979;  Loftis,  et  al,  1983;  Loftis, 
et  al,  1987).  The  following  general  approach  (or  some  variation  of  it)  is  commonly  used 
in  designing  monitoring  networks: 

1.  Define  monitoring  objectives, 

2.  Express  objectives  in  statistical  terms, 

3.  Determine  parameters,  frequency,  station  locations,  etc.  to  achieve  program 
objectives, 

4.  Implement  monitoring  program, 

5.  Report, 

6.  Evaluate  and  adjust  network  at  regular  intervals. 

In this study, we are primarily concerned with item 3, the specification of
sampling frequency. To examine this aspect, however, it is also necessary to look at
item 2, namely, expressing certain MISA objectives or sub-objectives in statistical terms.

To provide a definitive specification of sampling frequency in a program, it is necessary
to have a suitable set of data (i.e. a preliminary data set) that will provide the necessary
statistical information (e.g. water quality variability). Usually no data or only limited
data are available at the beginning of a study; thus data collected elsewhere or statistical
"rules of thumb" are often used to design a preliminary monitoring phase, which is then
re-designed once suitable data (usually one year) have been collected.

The  preliminary  data  set  must  be  sufficiently  detailed  to  allow  the  use  of  statistical 
techniques  to  characterize  the  data  in  terms  such  as  mean,  standard  deviation, 
confidence limits and significance level. Where water quality is variable due to random
and/or systematic variability, statistical values (such as the mean) are estimates of the
true values.


The  difference  between  estimated  and  true  values  can  be  calculated  statistically.  This 
difference  is  related  to  the  variability  of  the  water  quality  parameter  being  measured, 
which  in  turn  is  related  to: 

•  effluent  quality  and  quantity  related  to  the  process, 

•  sampling,  preservation  and  handling  procedures, 

•  laboratory  analytical  techniques,  and 

•  quality  assurance  and  protocols. 

When  the  preliminary  data  set  has  been  obtained  it  is  used  to  determine  the  inherent 
variability  of  the  data.  Once  this  variability  is  known  the  sampling  frequency  can  be 
modified  to  achieve  stated  precision  goals  for  the  program. 

In  the  present  study,  only  limited  data  were  available  concerning  water  quality  in 
industrial  effluents  (i.e.,  the  preliminary  data  set  is  not  sufficient).  This  limitation  was 
addressed  by  simulating  a  data  base  considered  to  be  representative  of  general  classes 
of  industries.  The  simulated  data  base  was  then  sampled  according  to  various  scenarios 
(e.g.  weekly,  thrice  weekly)  to  assess  effectiveness  of  each. 

1.4  STUDY  SCOPE  AND  ASSUMPTIONS 

The  scope  of  this  study  was  limited  to  specific  questions  of  sampling  frequency.  Other 
issues  such  as  station  location,  sample  collection  and  lab  analysis,  although  important 
considerations  in  monitoring  design,  were  not  addressed  in  this  report.  The  question  of 
how  the  information  generated  by  the  monitoring  programs  will  be  used  was  not 
addressed. For example, it was possible to quantify confidence intervals and accuracy for
various sampling frequencies and parameter variabilities. However, it was not possible
to assess whether a particular confidence interval and confidence level or accuracy will
be suitable for development of (BATEA) effluent limits. To overcome this limitation,
assumptions were made concerning the required accuracy and precision for
development of BATEA effluent limits.


1.5  REPORT  ORGANIZATION 

The report is presented in two separately bound volumes. The main report is contained
in Volume 1 (this volume). Volume 2 contains the technical appendices.

The  main  report  contains  an  Executive  Summary  and  Glossary  of  Terms  to  facilitate 
understanding  of  the  report.  The  technical  appendices  are  comprised  of  three  parts. 
An  overview  assessment  of  the  expected  variability  of  industrial  effluent  quality  and 
quantity  is  presented  in  Appendix  A.  The  documentation  of  the  computer  "model" 
developed  for  this  project  is  contained  in  Appendix  B.  Executable  files  and  sample  data 
for  the  BASIC  programs  and  MINITAB  macros  are  included  on  a  floppy  disk  which 
accompanies  Volume  2  of  this  report.  Appendix  C  contains  the  simulated  data  bases 
used  as  examples  in  the  report. 


2.0  METHODS 

2.1  INDUSTRIAL  EFFLUENTS

An  overview  assessment  was  undertaken  by  Zenon  Environmental  Inc.  (Canning,  1988) 
to  provide  insight  and  background  information  concerning  the  range  of  variability  likely 
to  be  encountered  in  industrial  effluent  quality  and  quantity.  The  assessment  was 
limited  in  scope,  intended  to  provide  an  approximate  range  of  expected  conditions 
which  could  be  used  as  a  framework  for  investigating  sample  frequency  requirements. 

The  assessment  was  accomplished  by  a  review  of  selected  references.  The  results  are 
presented  in  Appendix  A. 

2.2  STATISTICS 

The  procedures  used  to  investigate  sample  frequency  requirements  from  a  network 
design  perspective  were  based  primarily  on  the  work  of  several  investigators  at 
Colorado  State  University.  These  network  design  procedures  have  been  presented  in 
several  references,  e.g.  (Ward,  et  al,  1979;  Sanders,  et  al,  1979).  The  statistical 
formulae,  theorems,  etc.  used  in  calculation  (e.g.  confidence  intervals,  means,  etc.)  are 
available in any statistical text. For this project, Freund (1962), Spiegel (1961) and
Yevjevich (1972) were used as general references for statistical formulae.

2.3  COMPUTER  MODEL 

A  computer  model  was  developed  to  accomplish  the  following: 

1.  simulate  a  data  base, 

2.  sample  the  data  base  in  different  ways,  and 

3.  calculate  descriptive  statistics  for  various  sampling  scenarios. 

The  general  purpose  of  the  model  was  to  demonstrate  the  statistical  principles  involved 
in determining sample frequency requirements. In the absence of an adequate
preliminary data set, it was decided to include a simulation (generation of artificial
data) option. This component generates an artificial data base of predetermined
characteristics.


As  real  data  become  available,  they  can  be  analyzed  by  these  programs  and  the  results 
used  to  refine  the  monitoring  program. 
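
The three model functions listed above (simulate, sample, calculate statistics) can be sketched in a few lines. This is an illustration only, not the BASIC or MINITAB programs documented in Appendix B. It assumes a lognormal distribution for the daily values (the report does not state this), a 28-day month, and hypothetical function names and parameter values.

```python
import math
import random
import statistics


def simulate_daily(mean, cv, days, rng):
    """Draw `days` lognormal values with the requested mean and CV (assumed distribution)."""
    sigma2 = math.log(1.0 + cv ** 2)          # variance of the underlying normal
    mu = math.log(mean) - sigma2 / 2.0        # location giving the requested mean
    return [rng.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(days)]


rng = random.Random(1988)                     # fixed seed so the run is repeatable
month = simulate_daily(mean=45.0, cv=0.21, days=28, rng=rng)

# Sampling scenarios: one fixed day per week, or three fixed days per week.
weekly = month[0::7]
thrice = [month[d] for d in range(28) if d % 7 in (0, 2, 4)]

# Descriptive statistics for each scenario; daily sampling is taken as the reference.
for name, sample in (("daily", month), ("thrice weekly", thrice), ("weekly", weekly)):
    print(f"{name:14s} n={len(sample):2d} "
          f"mean={statistics.mean(sample):6.2f} sd={statistics.stdev(sample):6.2f}")
```

Repeating the run with different seeds, means and CVs gives a feel for how far the weekly and thrice weekly means can drift from the daily mean.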

Two  complementary  sets  of  programs  were  developed,  one  programmed  in  BASIC  and 
the  other  programmed  in  MINITAB  (a  statistical  analysis  package).  The  programs 
were  designed  to  work  together,  i.e.  the  output  of  one  can  be  analyzed  by  the  other. 

BASIC  was  selected  because  it  facilitated  the  simulation  of  data  in  a  graphical  format. 
MINITAB  was  selected  because  of  its  programming  ability  and  availability  of  a  wide 
range of statistical procedures.¹


¹Note: There are several excellent statistical analysis packages commercially available. The use of
MINITAB in this study should not be considered an endorsement.


3.0  FINDINGS 

3.1  INTRODUCTION 

This  chapter  presents  the  study  findings  relating  to: 

•  estimation of mean monthly concentrations and loads, and

•  determination  of  the  presence  or  absence  of  compounds. 

3.2  ESTIMATES  OF  MONTHLY  MEANS 

This section examines the first study objective, namely, the question of frequency of
sampling  necessary  to  provide  data  on  which  to  base  BATEA  effluent  limits.  Of 
specific  interest  is  the  question  of  whether  or  not  thrice  weekly  sampling  can  be  reduced 
to  weekly  sampling  while  maintaining  an  acceptable  level  of  accuracy  in  the  data. 

To  answer  this  question,  it  is  necessary  to  know  how  BATEA  effluent  limits  will  be 
derived  and  to  understand  the  nature  and  variability  of  effluent  flows  and  quality. 
Unfortunately,  this  information  is  only  partly  known  at  this  time.  Consequently,  to 
complete  the  investigation  it  is  necessary  to  make  the  following  assumptions. 

With regard to the derivation of BATEA effluent limits it has been assumed that:

1.  the relative error in estimating the mean should not exceed ±25% (as a measure
of accuracy),

2.  the relative precision in estimating the mean should not exceed ±50%, and

3.  a 95% level of confidence should be used.

(Assumptions #1 and #3 are consistent with those expressed by A. Sharma (1988).)

An overview assessment was undertaken by Zenon Environmental Inc. (Canning, 1988)
to provide insight and background information on the range of effluent quality and
quantity likely to be found in Ontario's industrial effluents (Appendix A). Based on the
overview the following relative variability levels have been assumed. The levels are
expressed in terms of the coefficient of variation (CV).

Relative Variability     Coefficient of Variation

Low                      CV < 30%
Medium                   30% < CV < 60%
High                     60% < CV < 200%
Very High                CV > 200%


3.2.1  Accuracy 

Accuracy  refers  to  the  degree  to  which  the  estimates  from  a  measurement  technique 
agree  with  the  true  value.  The  value  being  estimated  in  this  case  is  the  mean  monthly 
concentration.  A  measure  of  accuracy  is  provided  by  the  following  expression. 


$$\Delta = \frac{\mu - \bar{X}}{\mu} \times 100$$

where:  $\Delta$ (delta) is the relative error, expressed as a percentage,
$\mu$ is the true mean, and
$\bar{X}$ is the estimated or sample mean derived by some sampling scheme.

As stated in Section 3.2, our assumed goal is to achieve a value for $\Delta$ within
±25%. Thrice weekly and weekly sampling schemes were investigated for four "typical"
industries which represent a broad range of relative variability levels as explained
below. A generic profile of the "typical" industry characteristics follows. Data for each
industry were simulated using the programs described in Appendix B. The results
appear in Appendix C3.


Industry #1 represents "Low" relative variability. Continuous operation (24 hr/day and
7 days/week) plus a high level of effluent treatment (including biological treatment)
tends to produce effluents which vary over a relatively narrow range of quality and
quantity, compared to other groups. An example industry might be found in the
petroleum refining sector.

Industry #2 represents "Medium" relative variability. This industry uses large,
continuous production facilities and involves a high usage of industrial chemicals.
Effluent quality and quantity exhibit relatively moderate changes; however, plant upsets
can cause occasional high values. An example industry could be an inorganic chemical
manufacturing plant.

Industry #3 represents "High" relative variability. This industry uses large batch
processes to manufacture specialty chemicals. The process involves a large product
mix, e.g. pharmaceuticals, paints, dyes, inks, etc. Minimum treatment will enhance the
variability of effluent quantity and quality. An example of this industrial type may be
found in the organic chemical manufacturing sector.

Industry #4 represents "Very High" relative variability. Although the quality of process
effluent from this industry may be relatively constant, there are large contributing areas
where contaminated surface runoff can be discharged in response to rainfall and/or
melt conditions.

The data for this industry were simulated using the programs in Appendix B to
represent an actual industrial example. The model for this example is a base metal
mining industry in the Elliot Lake area. The monitoring point includes runoff from
tailings areas. The observed mean suspended solids concentration based on 83 weekly
average values was 3.75 mg/L with a standard deviation of 16.34 mg/L. The coefficient
of variation for this parameter was 435%.

The mean of the constituent to be modelled was 4.16 mg/L for the simulated data base,
with a standard deviation of 13.88 mg/L. The coefficient of variation was 334%.
Simulated data for this industry are contained in Appendix C3.
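
As a quick check, the coefficient of variation is simply the standard deviation expressed as a percentage of the mean, and it can be mapped onto the relative variability levels assumed in Section 3.2. The sketch below is illustrative; the function names are hypothetical, and assigning boundary values (exactly 30%, 60% or 200%) to the higher class is our assumption, since the report uses strict inequalities on both sides.

```python
def coefficient_of_variation(mean: float, std_dev: float) -> float:
    """Coefficient of variation, as a percentage of the mean."""
    return std_dev / mean * 100.0


def variability_level(cv_percent: float) -> str:
    """Map a CV (%) to a relative variability level.

    Boundary values are assigned to the higher class (an assumption).
    """
    if cv_percent < 30.0:
        return "Low"
    if cv_percent < 60.0:
        return "Medium"
    if cv_percent < 200.0:
        return "High"
    return "Very High"


observed = coefficient_of_variation(mean=3.75, std_dev=16.34)    # observed weekly averages
simulated = coefficient_of_variation(mean=4.16, std_dev=13.88)   # simulated data base
print(f"observed  CV = {observed:.1f}% ({variability_level(observed)})")
print(f"simulated CV = {simulated:.1f}% ({variability_level(simulated)})")
```

Both values fall well inside the "Very High" class and are close to the 435% and 334% quoted above.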


The  "SIMULATE"  program  (in  Appendix  B)  was  used  to  sample  the  four  (simulated) 
industrial  data  bases  for  thrice  weekly  and  weekly  sampling  scenarios.  The  program 
also  calculated  the  mean  based  on  daily  sampling  which  was  assumed  to  be  the  true 
value.  Each  industry  was  sampled  twelve  times  (i.e.  monthly). 

The results were entered into a LOTUS 1-2-3 spreadsheet for further analysis (Table
1). The estimated means were plotted against the true mean to provide a visual
representation of the accuracy of the two sampling schemes. The results are shown in
Figures 1a and 1b. The 1:1 line as well as the ±25% error lines are shown. Values
falling outside of the envelope failed to meet the assumed accuracy criterion of ±25%
of the true mean.

Figure 1b shows the lower concentration range for Figure 1a. Sampling scheme
efficiency is summarized in Table 2. At low relative variability both sampling schemes
produced results that were completely within our stated accuracy goals. At medium
relative variability thrice weekly sampling still produced results which were accurate
100% of the time, whereas 2 out of 12 (or 17%) of the samples from weekly sampling
had unacceptable accuracy. At high relative variability both schemes produced some
inaccuracies; however, the weekly sampling results were in error most often, 8 out of 12
times (67%). At very high relative variability both sampling schemes performed
poorly.
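
The accuracy check described above can be reproduced directly from the Table 1 values. The sketch below applies the relative error expression of Section 3.2.1 to the twelve monthly means for the medium-variability industry and counts the months falling outside the ±25% envelope. The function names are hypothetical; the data are copied from Table 1.

```python
def relative_error(true_mean: float, estimate: float) -> float:
    """Relative error (%) of an estimated mean against the true mean."""
    return (true_mean - estimate) / true_mean * 100.0


# (true, thrice weekly, weekly) monthly means, medium variability, from Table 1.
MEDIUM = [
    (65.20, 64.38, 72.25), (76.30, 81.85, 83.00), (80.40, 70.23, 101.50),
    (67.47, 60.33, 89.20), (74.33, 72.77, 64.50), (63.63, 65.23, 71.25),
    (83.17, 75.67, 86.20), (85.77, 85.00, 94.75), (81.00, 85.08, 81.00),
    (76.87, 69.92, 86.50), (86.53, 96.83, 90.60), (79.13, 83.92, 81.00),
]


def count_failures(rows, column, limit=25.0):
    """Number of months whose absolute relative error exceeds `limit` percent."""
    return sum(abs(relative_error(row[0], row[column])) > limit for row in rows)


thrice_fail = count_failures(MEDIUM, 1)
weekly_fail = count_failures(MEDIUM, 2)
print(f"thrice weekly: {thrice_fail} of 12, weekly: {weekly_fail} of 12")
```

The counts agree with the medium-variability results reported in the text: no failures for thrice weekly sampling and 2 of 12 for weekly sampling.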

3.2.2  Confidence  Intervals 

A confidence interval is defined as a range around an estimated sample statistic, such as
the mean. A statement can be made about the probability that this interval includes the
population mean, μ (i.e., the true mean). This statement is usually made at a 95%
level of confidence. This means that on average the true mean will be within the
stated confidence interval 95 out of 100 times. Ideally, we would like to have a
confidence interval width as close to zero as possible, i.e. we would like to be as precise
as possible.
TABLE 1: COMPARISON OF TRUE MONTHLY MEAN CONCENTRATION VS. ESTIMATED

VARIABILITY    MONTH      TRUE     THRICE     WEEKLY

Low              1       41.54      43.31      40.41
                 2       46.86      46.29      50.08
                 3       49.85      50.40      49.64
                 4       43.47      44.44      43.16
                 5       44.64      44.42      45.38
                 6       45.89      44.75      44.55
                 7       48.32      49.41      44.90
                 8       41.35      41.19      39.88
                 9       47.58      46.16      46.63
                10       45.30      46.13      44.93
                11       46.01      46.82      43.19
                12       41.77      41.80      41.46

Medium           1       65.20      64.38      72.25
                 2       76.30      81.85      83.00
                 3       80.40      70.23     101.50
                 4       67.47      60.33      89.20
                 5       74.33      72.77      64.50
                 6       63.63      65.23      71.25
                 7       83.17      75.67      86.20
                 8       85.77      85.00      94.75
                 9       81.00      85.08      81.00
                10       76.87      69.92      86.50
                11       86.53      96.83      90.60
                12       79.13      83.92      81.00

High             1       41.72      41.83      70.76
                 2       35.06      36.91       6.38
                 3       50.09      56.41      25.39
                 4       60.61      57.61      60.08
                 5       96.50     103.55     172.16
                 6       86.24      73.82     121.68
                 7       70.41      70.24      54.65
                 8       63.86      47.35      94.26
                 9       85.87      89.40      45.57
                10       70.37      70.54      78.38
                11       78.98      91.17      73.71
                12       76.27      72.45     128.83

Very High        1        0.60       0.77       0.50
                 2        2.90       4.38       1.00
                 3        1.60       1.08       5.00
                 4       19.27      23.75       0.80
                 5        2.90       2.46       5.25
                 6        5.33       5.15       9.25
                 7        1.80       2.58       1.80
                 8        2.83       3.92       1.00
                 9        1.67       0.85       1.75
                10        2.60       2.23       2.50
                11        4.93       5.17       5.00
                12        3.83       3.69       5.50

TABLE 2: SAMPLING SCHEME EFFICIENCY - ACCURACY

                                             Percentage of Samples
Relative             Coefficient of          with delta > ±25%
Variability Level    Variation (CV)          Thrice      Weekly

Very High            334%                    50%         58%
High                  92%                     8%         67%
Medium                38%                     0%         17%
Low                   21%                     0%          0%

A relative measure of precision was obtained by expressing the confidence interval
width as a percentage of the mean. In this study, we have selected the goal that the
relative precision should be less than ±50% of the mean.

The confidence interval is a function of the standard deviation and the sample size.
Since we generally have no control over the standard deviation, it is necessary to adjust
sample size to change the width of the confidence interval.

The confidence interval width may be calculated from the confidence limits. The
confidence limits are the numerical limits of the confidence interval. For samples of
small size (i.e., N < 30) the confidence limits about the mean ($\bar{x}$) can be calculated using
Student's t distribution, the standard deviation ($S_x$) and the number of samples (N)
according to the expression:

$$\bar{x} - \frac{t_{(n-1),\,\alpha/2}\; S_x}{\sqrt{N}} \;<\; \mu \;<\; \bar{x} + \frac{t_{(n-1),\,\alpha/2}\; S_x}{\sqrt{N}}$$

The effect of increasing sample size on the confidence interval width can be seen in a
general  way  in  Figure  2.  In  this  graph,  the  effect  of  the  standard  deviation  statistic  has 
been  removed  by  assuming  it  equal  for  all  cases.  The  normalized  confidence  interval 
width  is  depicted  for  the  95%  confidence  level. 

As  N  increases  the  normalized  confidence  interval  width  approaches  the  goal  of  zero. 
The  difference  between  sampling  thrice  weekly  and  weekly  can  be  seen  in  Figure  2.  At 
the  95%  confidence  level  the  confidence  interval  width  is  about  2.7  times  greater  for 
weekly  than  for  thrice  weekly. 
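
The dependence of the normalized interval width on N can be illustrated numerically. The sketch below assumes 4 samples per month for weekly sampling and 12 for thrice weekly sampling (the sample counts behind Figure 2 are not stated here, so these are assumptions), and it uses two-sided 95% critical values of Student's t taken from standard tables. The function name is hypothetical.

```python
import math

# Two-sided 95% critical values of Student's t from standard tables,
# keyed by degrees of freedom (N - 1).
T_975 = {3: 3.182, 11: 2.201}


def normalized_ci_width(n: int) -> float:
    """Full 95% confidence interval width, in units of the standard deviation."""
    t = T_975[n - 1]
    return 2.0 * t / math.sqrt(n)


weekly = normalized_ci_width(4)     # assumed 4 samples per month
thrice = normalized_ci_width(12)    # assumed 12 samples per month
print(f"weekly: {weekly:.2f}  thrice weekly: {thrice:.2f}  ratio: {weekly / thrice:.2f}")
```

The ratio depends on the sample counts assumed, so it need not match the value read from Figure 2 exactly; the point is that both the larger t value and the smaller N widen the weekly interval.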

When  standard  deviation  is  considered  as  a  factor,  the  difference  between  thrice  weekly 
and  weekly  increases  as  standard  deviation  increases.  This  is  examined  in  greater  detail 
using  the  "typical"  industrial  effluent  data  base  introduced  in  Section  3.2.1. 

As discussed previously, the "SIMULATE" program was used to generate and sample
four data bases according to thrice weekly and weekly sampling schemes. The 95%
confidence interval widths were calculated for each industry and sampling scheme on a
monthly basis. The results appear in Appendix C3.

The relative confidence interval widths were calculated for each sampling scenario.
The results are contained in Table 3. A value greater than 100% represents failure to
achieve our assumed goal that the confidence interval width (CIW) be less than ±50% of
the mean.

TABLE 3: RELATIVE CONFIDENCE INTERVAL WIDTH

VARIABILITY    MONTH     THRICE     WEEKLY
                           (%)        (%)

Low              1        23.6       92.0
                 2        24.4      185.0
                 3        33.7       66.8
                 4        18.7       41.9
                 5        20.9       29.7
                 6        24.1       17.3
                 7        15.7       43.9
                 8        33.9       22.2
                 9        20.1       50.5
                10        17.1       53.8
                11        27.1       22.9
                12        20.5       93.8

Medium           1        49.4       50.4
                 2        44.8      231.5
                 3        33.1      113.4
                 4        40.5      108.8
                 5        37.4      129.4
                 6        41.9       92.0
                 7        50.4       74.9
                 8        48.0       47.0
                 9        41.1       65.6
                10        25.3       45.7
                11        31.6       55.2
                12        52.7      138.6

High             1       131.8      628.0
                 2       131.1      273.7
                 3       123.4      231.4
                 4       123.2      277.2
                 5        80.3      262.3
                 6        88.6      305.7
                 7       132.3      249.2
                 8       123.7      178.5
                 9        80.0      171.6
                10        90.5      230.2
                11       108.7      156.8
                12        83.8      253.8

Very High        1       145.0      270.0
                 2       297.9       22.4
                 3        85.0     1447.5
                 4       288.8      238.5
                 5       159.3      792.4
                 6       139.6      586.1
                 7       264.4      237.8
                 8       199.3      283.4
                 9        95.2      282.0
                10        65.8      252.3
                11       145.8      331.4
                12       106.8      525.3

TABLE 4: SAMPLING SCHEME EFFICIENCY - PRECISION

                                Percentage of Samples
Relative Confidence             with CIW > ±50% of μ
Interval Width        CV        Thrice Weekly    Weekly

Very High             334%      75%              100%
High                   92%      58%              100%
Medium                 38%       0%               50%
Low                    21%       0%                8%


The efficiency of thrice weekly sampling versus weekly sampling is shown in Table 4.
The thrice weekly sampling approach achieves our assumed goal for low and medium
variability effluents whereas weekly sampling does not. At low relative variability weekly
sampling fails to achieve the required precision on 1 out of 12 samples (8%). At
medium variability weekly fails 6 out of 12 samples (50%). At higher variability weekly
fails 100% of the time and thrice weekly fails more than 50% of the time.

3.2.3  Loading  Calculation 

The loading of a constituent in an industrial effluent may be defined as the rate of mass
transport expressed in units of mass per time, e.g. kg/day, metric tonnes/year, etc.
Loading is not measured directly; rather it is calculated using flow and concentration
data.

Various methods may be used to estimate mean monthly loads. In the case of daily
sampling an appropriate method would be to sum the products of the individual mean
daily flows and concentrations and divide the total by the number of samples.
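
The daily-sampling method just described amounts to averaging the daily products of flow and concentration. The sketch below is a minimal illustration; the function name, the units (m3/day and mg/L, giving kg/day) and the three days of data are hypothetical and are not taken from the report.

```python
def mean_load(flows_m3_per_day, concs_mg_per_l):
    """Mean load in kg/day from paired mean daily flows and concentrations."""
    if len(flows_m3_per_day) != len(concs_mg_per_l):
        raise ValueError("flow and concentration series must have equal length")
    # 1 mg/L x 1 m3/day = 1 g/day = 0.001 kg/day
    daily_loads = [q * c / 1000.0 for q, c in zip(flows_m3_per_day, concs_mg_per_l)]
    return sum(daily_loads) / len(daily_loads)


flows = [1000.0, 1200.0, 800.0]   # hypothetical mean daily flows, m3/day
concs = [4.0, 5.0, 2.5]           # hypothetical mean daily concentrations, mg/L
print(mean_load(flows, concs))    # daily loads are 4, 6 and 2 kg/day
```

Because each day's load is computed before averaging, this method does not require flow and concentration to be independent.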

When  such  a  data  base  is  available  then  the  confidence  intervals  for  the  mean  can  be 
calculated  in  the  same  manner  as  discussed  previously  in  Section  3.2.2: 

When loading is determined by the multiplication of monthly means (flow and
concentration) the reliability of the calculated value is found by applying the theory
of propagation of errors (Overman & Clark, 1960).

Assuming that concentration and flow are independent variables, the variance of the
load ($\sigma_L^2$) is obtained by the equation:

$$\sigma_L^2 = C^2 \sigma_Q^2 + Q^2 \sigma_C^2 \qquad [1]$$

where:  L is load,
C is concentration,
Q is flow, and
$\sigma^2$ is variance.

In cases where concentration and flow are not independent variables the variance of the
load is given by:

$$\sigma_L^2 = C^2 \sigma_Q^2 + Q^2 \sigma_C^2 + 2\,Q\,C\,\sigma_{QC} \qquad [2]$$

where:  $\sigma_{QC}$ is the flow-concentration covariance and other parameters are as defined
above.

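Equation 2 can be simplified by dividing both sides by $C^2 Q^2$:

$$\frac{\sigma_L^2}{C^2 Q^2} = \frac{\sigma_Q^2}{Q^2} + \frac{\sigma_C^2}{C^2} + \frac{2\,\sigma_{QC}}{C\,Q} = CV_Q^2 + CV_C^2 + \frac{2\,\sigma_{QC}}{C\,Q}$$

But $L = QC$, so

$$\frac{\sigma_L^2}{C^2 Q^2} = \frac{\sigma_L^2}{L^2} = CV_L^2$$

$$\therefore \quad CV_L^2 = CV_Q^2 + CV_C^2 + \frac{2\,\sigma_{QC}}{L} \qquad [3]$$

where CV is the coefficient of variation.

Similarly, for the independent case, it can be shown that

$$CV_L^2 = CV_Q^2 + CV_C^2 \qquad [4]$$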

The above relationship can be illustrated by an example. Assume that flow and
concentration data are independent, so that [4] can be used, and that the mean monthly load
(L) is to be calculated from the mean monthly flow (Q) and mean monthly
concentration (C) using the following values:

C      =  4.2 mg/L
$CV_C$ =  334%
Q      =  13,088.9 m3/s
$CV_Q$ =  9.3%

$$L = C \times Q = 4.2\ \tfrac{\text{mg}}{\text{L}} \times 13{,}088.9\ \tfrac{\text{m}^3}{\text{s}} = 5.5 \times 10^{7}\ \tfrac{\text{mg}}{\text{s}}$$

$$CV_L^2 = CV_Q^2 + CV_C^2 = 9.3^2 + 334^2$$

$$CV_L = 334\%$$

If the coefficient of variation of the flow is increased to 25% (e.g., less accurate
measurement techniques) and the calculations repeated, then $CV_L$ is:

$$CV_L^2 = 25^2 + 334^2$$

$$CV_L = 335\%$$

In  other  words  when  the  coefficient  of  variation  is  high  for  one  element  of  the 
calculation  with  respect  to  the  other  then  that  element  is  predominantly  responsible  for 
the  variation  in  the  calculated  product.  In  such  cases,  reducing  the  accuracy  for  the 
parameter  with  a  very  low  CV  will  result  in  only  a  small  increase  in  the  CV  of  the 
calculated  product. 
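
The arithmetic of this example is easy to repeat for other combinations. The sketch below implements equation [4] (the independent case only); the function name is hypothetical, and the final comparison with a low-variability constituent (CV of 21%) is our own illustration rather than a calculation from the report.

```python
import math


def cv_of_load(cv_flow: float, cv_conc: float) -> float:
    """CV (%) of a load L = Q x C for independent flow and concentration, eq. [4]."""
    return math.sqrt(cv_flow ** 2 + cv_conc ** 2)


# Values from the worked example: CV of concentration 334%, CV of flow 9.3% then 25%.
print(round(cv_of_load(9.3, 334.0)), round(cv_of_load(25.0, 334.0)))

# Illustration: the same change in flow CV for a constituent with a CV of 21%.
print(round(cv_of_load(9.3, 21.0), 1), round(cv_of_load(25.0, 21.0), 1))
```

For the highly variable constituent the load CV moves from about 334% to about 335%, whereas for the low-variability constituent the same loss of flow accuracy raises the load CV substantially.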

3.2.4  Discussion 

A comparison of the suitability of weekly and thrice weekly sampling for achieving the
assumed accuracy and precision goals is given below.

                  Accuracy                       Precision

Weekly            suitable for low               not suitable for any
                  variability industries         level of variability

Thrice weekly     suitable for low and           suitable for low and
                  medium variability levels      medium variability levels

For  industries  with  high  levels  of  variability,  a  sampling  frequency  greater  than  thrice 
weekly  is  required. 

It is worthwhile to note that parameters within an industry will have different variability
levels. Thus, a particular industry may very well require different sampling frequencies
for  each  parameter  or  group  of  parameters.  When  designing  a  program  with  different 
sampling  requirements,  it  is  possible  to: 

a)  design  for  the  most  demanding  parameter,  or 

b)  design  to  some  acceptable  middle  condition. 

Designing to the most demanding parameter will ensure that the required data are
obtained for all cases, although costs and logistics will be high and some unnecessary
data will be obtained. Designing to some middle condition will reduce costs and
logistics but may result in loss of some required data.

With regard to approach (a), a goal of uniform precision in the data may be applicable to
the MISA program. For example, sampling at a thrice weekly frequency at the highest
variability industry will produce results that are accurate to ±X%. Sampling at the
same frequency at a low variability industry will produce results that are accurate to
±Y%, where Y < X. If an accuracy of ±X% is accepted as sufficient for all industries,
then the sampling frequency can be reduced at the low variability industry until its
accuracy relaxes from ±Y% to ±X%. The same argument would apply to precision.

Finally, in the case of loading estimates, there is the potential for reducing the required
accuracy of flow measurement devices where the CV for the constituent of concern is
very high compared to the CV for the flow. In such cases, the CV for the computed
load will be increased very little. However, when the CV for the constituent is low,
decreasing the accuracy of flow measurements will significantly increase the CV for the
calculated load. Reducing the accuracy of flow measuring devices (i.e. using less
accurate equipment) should not be considered for a sampling point unless it can be
demonstrated that the CVs for all constituents are high in relation to the CV for flow.

3.3  PRESENCE/ABSENCE  DETERMINATION 

The  monitoring  regulation  for  the  petroleum  refining  sector  refers  to  closed 
characterization  and  open  characterization.  The  purpose  of  closed  characterization  is 
to  identify  the  presence  of  compounds  on  the  EMPPL  that  are  not  routinely  measured. 
The  purpose  of  open  characterization  is  to  identify  the  presence  of  compounds  that  are 
not  on  the  EMPPL. 

This  section  first  explores  the  question  of  sampling  frequency  required  to  determine 
presence  or  absence  of  compounds  in  industrial  effluent  and  then  presents  an  example 
application  using  simulated  data  to  demonstrate  the  effect  of  different  data  set 
characteristics  on  the  ability  of  a  sampling  scheme  to  detect  the  presence  of  compounds. 

3.3.1  Application  of  the  Binomial  Distribution 

The question of sampling frequency requirements for determining the
presence/absence of EMPPL compounds may be stated statistically as:

"What  is  the  probability  of  obtaining  at  least  one  result  above  detection  limits?" 

The  binomial  distribution  with  the  assumptions  listed  below  is  used  as  a  starting  point: 

1.  the  probability  of  success  is  the  same  for  each  trial,  and 

2.  the  trials  are  independent. 

The binomial distribution is given by the equation:

$$b(x; n, \theta) = \binom{n}{x}\,\theta^{x}\,(1-\theta)^{n-x}$$

where: x is the number of successes,

n is the number of trials, and

θ is the probability of success and is constant from trial to trial
(Freund, 1962).

Let p be the probability of obtaining zero successes in n trials, i.e. p = b(0; n, θ) =
(1 - θ)^n. Then q = 1 - p is the probability of obtaining at least one sample above
detection limits. In practice, binomial probabilities are rarely calculated directly but
are available from tables. Values of q for various values of θ and N (the number of
samples per year), derived from tables contained in Freund (1962), are shown below.


BINOMIAL PROBABILITIES (q) FOR DETECTING AT LEAST ONE SAMPLE
ABOVE DETECTION LIMITS

  θ      N = 12       N = 6           N = 2                N = 1
         (Monthly)    (Bi-monthly)    (Twice per year)     (Once per year)

 .5      .9998        .9844           .7500                .5000
 .3      .9862        .8824           .5100                .3000
 .1      .7176        .4686           .1900                .1000


These  results  are  also  shown  in  Figure  3. 
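
The tabulated probabilities can be reproduced directly from q = 1 - (1 - θ)^N rather than read from tables. The sketch below is a minimal illustration; the function name `prob_at_least_one` is an assumption, not part of the report's programs.

```python
def prob_at_least_one(theta, n):
    """Probability of at least one above-detection-limit result in n independent samples."""
    return 1.0 - (1.0 - theta) ** n


# Sampling frequencies from the table: monthly, bi-monthly, twice and once per year.
frequencies = (12, 6, 2, 1)

for theta in (0.5, 0.3, 0.1):
    row = "  ".join(f"{prob_at_least_one(theta, n):.4f}" for n in frequencies)
    print(f"theta = {theta:.1f}:  {row}")
```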

If a goal of the project were to design a program that would always ensure a given
probability of detecting at least one sample above the detection limit, then Figure 3 can
be used to determine the sampling frequency for different values of θ.

FIGURE 3: Probability of at least one sample being above detection limits
(θ is the probability of presence on any given day and is constant year round)

For example, for a θ of .5 (i.e., there is a .5 probability that a compound is present on
any given day and this probability is constant year round), N must be at least 3 (e.g.,
sample once every four months) to yield a q of at least .8 (.8 probability of measuring
at least one value above detection limit).

If θ is .3 then N must be 5 (approximately quarterly sampling) and if θ is .1 then N
must be 16 (approximately once every three weeks).
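
The sample counts quoted above follow from finding the smallest N for which 1 - (1 - θ)^N reaches the target probability. The following is a minimal sketch of that search; the function name `min_samples` and the default target of 0.8 are assumptions chosen to match the example.

```python
def min_samples(theta, target_q=0.8):
    """Smallest number of independent samples N with 1 - (1 - theta)**N >= target_q."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must be in (0, 1]")
    if not 0.0 < target_q < 1.0:
        raise ValueError("target_q must be in (0, 1)")
    n = 1
    while 1.0 - (1.0 - theta) ** n < target_q:
        n += 1
    return n


for theta in (0.5, 0.3, 0.1):
    print(f"theta = {theta:.1f}: N = {min_samples(theta)}")
```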

The value of θ will vary from industry to industry and from parameter to parameter
within an industry. The majority of the parameters on an industry's EMPPL will have
θ values equal to 1 (always detected, e.g. suspended solids) or almost equal to 1 (e.g.
chromium was above detection limits on 80% of samples, thus θ = .8).

All industrial sectors (and many industries) will undergo pre-regulation monitoring for
characterization purposes. Assuming that at least 4 samples are obtained from
pre-monitoring, the binomial distribution tells us that there is approximately a 94%
probability or better (1 - .5^4 = .9375) that at least one above detection limit value
will have been obtained for θ ≥ .5.

The parameters that are of interest in characterization are the ones with the lowest θ
and thus the ones that are the hardest to detect (i.e. will require greater sampling
frequency).

Figures 4 and 5 illustrate the effect of different values of θ for quarterly sampling and
monthly sampling, respectively. Once again assuming that 0.8 is the minimum
probability acceptable for the program, Figure 4 shows that quarterly sampling would
be suitable for parameters which have a θ > .32. Similarly, Figure 5 shows that monthly
sampling would be appropriate for parameters where θ > .12.
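
The thresholds read from the figures can be compared with the closed-form solution of 1 - (1 - θ)^N = q for θ. The sketch below is illustrative only (the function name `min_theta` is an assumption); it gives about .33 for quarterly and about .13 for monthly sampling, close to the values read graphically.

```python
def min_theta(n_samples, target_q=0.8):
    """Smallest theta for which n_samples independent samples reach target_q."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if not 0.0 < target_q < 1.0:
        raise ValueError("target_q must be in (0, 1)")
    return 1.0 - (1.0 - target_q) ** (1.0 / n_samples)


for label, n in (("Quarterly", 4), ("Monthly", 12)):
    print(f"{label} (N = {n}): theta >= {min_theta(n):.3f}")
```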

3.3.2  Serial  Correlation 

The binomial distribution was used in the previous section to identify the
sampling frequency required to identify presence/absence of a compound. In this
section, an empirical technique is used to investigate a complicating factor that is not
accounted for in the application of the binomial distribution, namely serial correlation.

FIGURE 4: Binomial probability

Serial correlation (sometimes referred to as autocorrelation) is the degree to which a
data set is related to itself at some specified lag period. In practical terms, this may be
thought of as the correlation coefficient of a data set with itself after it has been shifted
by a lag unit (often 1). Serial correlation is denoted by the autocorrelation coefficient r.
Industrial effluent data often exhibit serial correlation at lag = 1.
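
As an illustration of the definition above, the lag-1 autocorrelation coefficient can be estimated as sketched below. This uses one common estimator (lagged cross-products divided by the total sum of squares); it is not necessarily the one used in the report's macros, and the two example series are hypothetical.

```python
def lag_autocorrelation(values, lag=1):
    """Estimate the autocorrelation coefficient r of a series at the given lag."""
    n = len(values)
    if lag < 1 or lag >= n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = sum(values) / n
    total_ss = sum((v - mean) ** 2 for v in values)
    if total_ss == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    lagged = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return lagged / total_ss


# Hypothetical series: one with persistent runs, one that alternates every step.
persistent = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
alternating = [1, 0] * 8

print(f"persistent:  r = {lag_autocorrelation(persistent):+.4f}")
print(f"alternating: r = {lag_autocorrelation(alternating):+.4f}")
```

The persistent series gives a positive r and the alternating series a negative r, matching the intuitive meaning of positive and negative serial correlation.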

Serial  correlation  was  selected  for  further  study  in  order  to  test  the  sensitivity  of  the 
assumption  of  independence  in  the  binomial  distribution. 

A series of MINITAB macros (Appendix B) was developed by Dr. R. Green of the
University of Western Ontario to simulate and sample data sets to test the efficiency of
various sampling schemes to identify above detection limit values. The results obtained
were used to develop curves relating the percentage of above detection limit runs
identified for various sampling schemes and autocorrelation values.

This approach was based on the use of a binary data set, i.e., a series of ones (1's) and
zeros (0's) to represent the true presence (1) or the true absence (0) of a compound.

Figure 6 is a representation of a hypothetical time series in both analog and binary
forms. (This is not the actual data base used in subsequent analysis, but a simplified
version used here for illustration purposes.) The "true level" is represented by the thick
line which can be seen to vary in magnitude with time. Regular sampling produces the
series of ticks along the true level. The size of the ticks represents the combined
variability associated with sampling, analytical and other estimation errors. The
horizontal line is the minimum detectable level. This is shown to be constant although,
in reality, this too could change daily.

The binary coding relates to presence/absence rather than the true level. It may be
thought of as true presence/absence. The binary data set is obtained by checking to see
whether the true level is below the minimum detectable level (absent = 0) or above
(present = 1). A series of 1's is called a run and may be thought of as a violation event
or pollution event. A run can be of length one or greater.

The pattern of variations in the true presence/absence can be adequately modelled by
varying two parameters in a binary time series, i.e., the mean (θ) of the true
presence/absence and the autocorrelation coefficient (r). (Note: θ is used in place of μ
to facilitate comparison with the Binomial Distribution.) In the case of the binary data
base the mean (θ) is also the probability of a violation occurring on a given day. The
mean can vary from 0.0 (total absence) to 1.0 (total presence).
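
The binary coding, the runs and the mean θ described above can be illustrated with a short sketch. The concentration values, the detection limit and the function names are all hypothetical and are not taken from the report's data bases.

```python
def to_binary(levels, detection_limit):
    """Code each true level as 1 (present, above the limit) or 0 (absent)."""
    return [1 if level > detection_limit else 0 for level in levels]


def run_lengths(binary):
    """Lengths of consecutive runs of 1's (violation or pollution events)."""
    runs = []
    current = 0
    for value in binary:
        if value == 1:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


# Hypothetical true levels and a constant minimum detectable level of 1.0.
levels = [0.2, 0.8, 1.5, 1.7, 0.4, 0.3, 2.1, 0.9, 1.2, 1.3, 1.1, 0.5]
binary = to_binary(levels, detection_limit=1.0)
theta = sum(binary) / len(binary)   # mean of the binary series

print("binary:", binary)
print("runs:  ", run_lengths(binary))
print("theta: ", theta)
```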

Autocorrelation  (r)  is  the  correlation  of  a  time  series  variable  with  itself  specifying  a  lag 
factor,  in  this  case  one.  An  autocorrelation  coefficient  of  zero  would  indicate  no 
relationship  between  one  day's  concentration  and  the  next  (i.e.,  this  would  satisfy  the 
assumption  of  independence  in  the  binomial  distribution). 

The  MINITAB  macros  (contained  in  Appendix  B)  were  used  to  simulate  a  binary  data 
set  and  to  examine  the  effects  of  serial  correlation  on  the  efficiency  of  sampling 
schemes  to  identify  above  detection  limit  values.  The  data  were  considered  to 
represent  presence/absence  of  a  compound  on  a  weekly  basis.  The  total  length  of  the 
simulation was 4 years (N = 210). The program uses a θ value of .5 and four different
values of r.
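
The MINITAB macros themselves are in Appendix B and are not reproduced here. As a rough stand-in, the sketch below generates a binary series with mean θ and lag-1 autocorrelation r using a two-state Markov chain, then measures the fraction of runs caught by regular sampling. The Markov-chain construction, the function names, the r values and the 13-week (quarterly) sampling interval are assumptions made for illustration, not the report's method.

```python
import random


def simulate_binary_series(n, theta, r, rng):
    """Two-state Markov chain with stationary mean theta and lag-1 autocorrelation r."""
    p_stay_present = theta + r * (1.0 - theta)   # P(present | present at last step)
    p_become_present = theta * (1.0 - r)         # P(present | absent at last step)
    for p in (p_stay_present, p_become_present):
        if not 0.0 <= p <= 1.0:
            raise ValueError("theta and r give an invalid transition probability")
    series = [1 if rng.random() < theta else 0]
    for _ in range(n - 1):
        p = p_stay_present if series[-1] == 1 else p_become_present
        series.append(1 if rng.random() < p else 0)
    return series


def fraction_of_runs_detected(series, interval):
    """Share of runs of 1's that contain at least one sampled time step."""
    runs_total = 0
    runs_hit = 0
    in_run = False
    run_counted = False
    for step, present in enumerate(series):
        if not present:
            in_run = False
            continue
        if not in_run:                 # a new run starts at this step
            in_run = True
            run_counted = False
            runs_total += 1
        if step % interval == 0 and not run_counted:
            runs_hit += 1
            run_counted = True
    return runs_hit / runs_total if runs_total else 0.0


if __name__ == "__main__":
    rng = random.Random(1)
    for r in (-0.6, 0.0, 0.6):         # illustrative r values (assumed)
        results = [
            fraction_of_runs_detected(simulate_binary_series(210, 0.5, r, rng), 13)
            for _ in range(200)
        ]
        mean_fraction = sum(results) / len(results)
        print(f"r = {r:+.1f}: mean fraction of runs detected = {mean_fraction:.2f}")
```

With this construction, positive r produces longer runs, so a fixed sampling interval is more likely to land inside any given run.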

The effects of serial correlation in the simulated data set on the efficiency of various
sampling schemes in identifying above detection limit runs are shown in Figure 7. At
sampling frequencies of quarterly and greater, positive serial correlation resulted in
increased ability to identify above detection limit runs whereas negative serial
correlation resulted in decreased ability. At a frequency of twice per year, an
autocorrelation coefficient of +.6 increased the ability to identify runs whereas no
difference was observed for other values of r.

Further simulations and use of different values of θ could be run to refine the analyses;
however, it is unlikely that the general findings above would change.

FIGURE 7: The effect of serial correlation on sampling efficiency

3.3.3 Example Applications

Nineteen  data  sets  were  generated  by  the  "SIMULATE"  program  and  sampled 
repeatedly  to  test  the  ability  of  various  sampling  schemes  to  detect  presence  of  a 
contaminant.  The  data  sets  sampled  included  both  approximately  normal  (4)  and  non- 
normal  (15)  distributions.  The  coefficients  of  variations  ranged  from  8%  to  154%.  In 
terms  of  relative  variability  levels  as  defined  in  Section  3.2.1,  the  distribution  was  as 
follows: 

Relative Variability      Number of Data Sets

Low                       6
Medium                    6
High                      7
Very High                 0

All  data  sets  had  positive  autocorrelation  at  lag  =  1. 

Each data set was sampled ten times for each sampling frequency and θ value and the
results averaged. By changing the detection limit for a data set, a new value for θ
(probability of a contaminant being present) could be derived. The average number of
detections per θ value and per sampling frequency was then calculated. The θ value
where detection first occurred was then selected by manual methods. The time series
plots and summary statistics for each data set are shown in Appendices C1 and C2,
respectively.
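
The procedure described above can be sketched as follows. This is only an illustration under stated assumptions: θ is taken as the fraction of values above the chosen detection limit, each sampling pass is a regularly spaced sample with a random starting point, and the function names and the example series are hypothetical rather than drawn from the "SIMULATE" program.

```python
import random


def theta_for_limit(values, detection_limit):
    """Fraction of the data set above the detection limit (the theta value)."""
    return sum(v > detection_limit for v in values) / len(values)


def detection_rate(values, detection_limit, n_samples, repeats, rng):
    """Fraction of repeated sampling passes that find at least one value above the limit."""
    step = len(values) // n_samples
    if step < 1:
        raise ValueError("n_samples exceeds the length of the data set")
    detected = 0
    for _ in range(repeats):
        start = rng.randrange(step)                 # random start, then regular spacing
        picks = values[start::step][:n_samples]
        if any(v > detection_limit for v in picks):
            detected += 1
    return detected / repeats


if __name__ == "__main__":
    rng = random.Random(7)
    weekly = [rng.lognormvariate(0.0, 0.8) for _ in range(52)]   # hypothetical data set
    for limit in (0.5, 1.0, 2.0, 4.0):
        theta = theta_for_limit(weekly, limit)
        rate = detection_rate(weekly, limit, n_samples=4, repeats=10, rng=rng)
        print(f"limit = {limit}: theta = {theta:.2f}, quarterly detection rate = {rate:.1f}")
```

Raising the detection limit lowers θ for the same data set, which is how a single simulated series can stand in for a range of θ values.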

Next, a matrix was derived containing the minimum θ value that a sampling frequency
was capable of detecting for each data set. Finally, the results for the nineteen data sets
were averaged and graphed (see Figure 8). The best fit line in the graph has been
drawn "by eye".

The graph shows that the average θ value where detection first occurs declines as the
number of samples obtained increases, as one would expect. Annual and semi-annual
sampling do not appear capable of reliably detecting presence when θ is less than 0.4.
(It should be noted that these simulations did detect presence at very low θ values on
occasion but not consistently.)

FIGURE 8: Theta versus number of samples

At sampling frequencies greater than about six times per year (or bi-monthly), the
θ value decreases slowly to a minimum of about 0.15 for monthly (12) sampling.

Figure 8 was used to estimate the number of samples (N) required to identify at least
one value above detection limits for various θ values. The results are summarized below
along with the associated binomial distribution probabilities (q) for these θ and N
values.

  θ        N       Binomial Probability (q)

 .15      12       85%
 .20       7       79%
 .30       4       76%
 .40       2       64%

3.3.4  Discussion 

The binomial distribution appears to be a reasonable and simple method for
determining sampling frequency requirements for a program to identify
presence/absence of compounds. Violation of the independence assumption does not
appear to significantly alter results. Serial correlation can potentially alter the ability of
a sampling scheme to detect presence; for example, positive correlation enhances the
ability to detect presence. Since most data sets will be positively correlated to some
degree, the binomial distribution would tend to be slightly conservative in this regard.

Use of the empirically derived Figure 8 to design a program and comparison of results
to the binomial distribution indicated that monthly sampling would be required for θ
as low as .15 and quarterly sampling would be required for θ as low as .30.
Semi-annual sampling would be adequate only where θ is > .4 and a lower probability
of detection is acceptable (i.e. about 64%).

4.0 GENERAL PROTOCOL FOR ESTIMATING SAMPLING FREQUENCY

The  findings  discussed  in  Chapter  3  relied  on  simulated  data  and  broad  classes  of 
industrial  variability  levels.  The  next  logical  step  to  address  the  specific  issues  of 
sampling  frequency  of  interest  in  this  study  would  be  the  analysis  of  actual  industrial 
effluent  data  using  the  techniques  presented  earlier  and  other  methods  described 
below. 

Ward  et  al  (1987)  outlined  a  framework  for  complete  design  of  a  monitoring  network 
that  builds  on  previous  experience.  The  following  general  protocol  utilizes  several  of 
Ward's  recommendations  as  well  as  techniques  presented  by  other  authors  and  is 
intended  to  serve  as  a  guide  for  possible  future  investigations. 

1.  Review  MISA  program  objectives  and  restate  these  in  statistical  terms.  The 
methods  by  which  BATEA  effluent  guidelines  will  be  developed  and  how  the 
monitoring  data  will  be  used  should  be  specified  in  precise  terms.  When  this  is 
achieved,  it  should  be  possible  to  specify  the  required  accuracy  for  mean 
monthly  load  estimates,  etc. 

2.  Refine or expand relative variability levels for different industries and
parameters. Identify representative industries and their associated σ, μ and
CV statistics (as explained in Chapter 3).

3.  Obtain suitable data bases representative of the relative variability levels and
parameters identified in step #2.

4.  Analyze the data base to characterize it statistically. This should include
estimation of the population mean, standard deviation, seasonal variability and
serial correlation and (if trend analysis or testing of statistical hypotheses is to be
undertaken) identification of the applicable probability distribution.

A procedure for identifying periodic trends in time series data is contained in a
paper by Loftis, et al., 1987. This procedure uses a correlogram to visually
display periodic trends. The correlogram feature was included as an option in
the DESIGN program contained in Appendix B. Several other useful graphical
techniques and simple statistics that can be used to interpret environmental data
are contained in a paper by Berthouex, et al., 1981. These features include:

1.  scattergram  of  the  time  series  (included  as  an  option  in  the  DESIGN 
program), 

2.  histograms  of  absolute,  relative  and  cumulative  frequencies  (included  in 
the  DESIGN  program), 

3.  calculation  of  simple  statistics  such  as  mean,  standard  deviation, 
maximum,  minimum,  range,  coefficient  of  variation  and  autocorrelation 
coefficient  (included  in  the  DESIGN  program), 

4.  moving  averages,  and, 

5.  cumulative  sum  chart. 

Many  other  statistical  procedures  are  available  for  time  series  pattern  definition. 
These  features  are  found  in  most  commercially  available  statistical  analysis 
packages.  For  a  description  of  these  procedures,  the  reader  is  referred  to  any 
statistical  text  on  time  series  analysis. 

5.  For many statistical tests (e.g. comparison of means between industrial sectors),
use the information from step #4 to select the appropriate probability
distribution function to match the test requirements to the population
characteristics.




In hydrology (and water quality aspects of hydrology), it has been common
practice to assume water constituents are normally distributed or that
transformed data are normally distributed (e.g. Ward, et al., 1979; Loftis, et al.,
1987). Under assumptions of normality, subsequent analyses are simplified.

Water quality constituents may have probability density functions (PDFs) or
distributions other than normal (e.g. Poisson, log-normal, negative binomial,
uniform, Pearson Type III, etc.) or the PDFs may be ill defined or not defined
(distribution free). In such cases, statistics for normal distributions will not
apply.

If  a  known  PDF  can  be  fitted  to  the  data  then  the  statistics  for  that  PDF  can  be 
used.  If  no  fit  is  found  then  the  data  are  said  to  be  distribution  free  and  non- 
parametric  techniques  are  used  in  statistical  tests. 

The  "SIMULATE"  program  includes  two  tests  for  measuring  normality,  i.e., 
skewness  and  kurtosis.  Other  measures,  such  as  plotting  the  data  on  probability 
paper and Kolmogorov-Smirnov statistics, are available through commercial
statistics  packages. 

The  simulated  data  used  in  the  examples  in  this  report  include  data  with 
approximately  normal  PDF  and  distribution-free  data. 

6.  Repeat the analysis described in Chapter 3.0 using actual data.
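
The simple statistics and charts named in steps 4 and 5 of the protocol (mean, standard deviation, range, coefficient of variation, skewness, kurtosis, moving averages and the cumulative sum chart) can be sketched as below. This is not the DESIGN or "SIMULATE" program; the function names are hypothetical and moment-based definitions of skewness and kurtosis are assumed.

```python
import statistics


def summarize(values):
    """Simple descriptive statistics for a series of effluent measurements."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)                    # sample standard deviation
    m2 = sum((v - mean) ** 2 for v in values) / n    # central moments about the mean
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "sd": sd,
        "cv_pct": 100.0 * sd / mean,
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "skewness": m3 / m2 ** 1.5,                  # 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,       # 0 for a normal distribution
    }


def moving_average(values, window):
    """Moving average over a fixed window length."""
    return [sum(values[i:i + window]) / window for i in range(len(values) - window + 1)]


def cusum(values, target):
    """Cumulative sum of deviations from a target value (cumulative sum chart)."""
    totals = []
    running = 0
    for v in values:
        running += v - target
        totals.append(running)
    return totals


data = [2, 4, 4, 4, 5, 5, 7, 9]                      # hypothetical concentrations
for name, value in summarize(data).items():
    print(f"{name}: {value:.3f}")
print("moving average (3):", moving_average(data, 3))
print("cusum about 5:", cusum(data, 5))
```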




7.  Based on the binomial distribution and empirically derived relationships
between θ (the probability that a constituent is present above detection limits)
and sampling frequency, the following requirements were identified (assuming
that 80%* is the minimum acceptable probability for detection).

Frequency         Suitable for θ

Monthly           > .15
Bi-monthly        > .25
Quarterly         > .35
Semi-annual       > .50 (*at 75%) or > .40 (*at 64%)

5.2  RECOMMENDATIONS 

1.  Review MISA objectives and restate these in statistical terms. Specifically,
determine accuracy and precision requirements for identification of BATEA
effluent limits. Also specify the minimum θ that the characterization program
should detect and the minimum probability for detection that is acceptable.

2.  Obtain  industrial  data  bases  and  perform  the  analysis  outlined  in  Chapter  3  to 
test  the  results  of  this  study. 


Respectfully  submitted, 
GARTNER  LEE  LIMITED 


J.E.  O'Neill,  B.Sc. 
Hydrologist, 
Senior Consultant

GLOSSARY OF TERMS

Accuracy  is  the  closeness  to  the  true  value  of  the  quantity  being  measured  or  to  an 
accepted  reference  value. 

Autocorrelation  coefficient  is  a  measure  of  the  degree  to  which  a  data  set  is  related  to 
itself  at  some  specified  lag  period,  usually  one  day.  Autocorrelation  is  sometimes 
termed  serial  correlation.  The  presence  of  autocorrelation  in  a  data  set  violates  the 
assumption  of  independence  upon  which  many  statistical  techniques  depend. 

BASIC  is  a  computer  programming  language. 

BATEA  stands  for  Best  Available  Technology  that  is  Economically  Achievable. 

Coefficient  of  variation  is  the  ratio  of  the  standard  deviation  to  the  mean  and  is  usually 
expressed  as  a  percentage. 

Confidence interval is the range in which the true value (e.g. the mean) is expected to
fall with a certain confidence level.

Confidence interval width is the numerical extent or range of the confidence limits. For
example, if the confidence limits were ±5 mg/L then the confidence interval width
would be 10 mg/L.

Confidence level is a quantitative expression of the reliability of an estimated value. The
expression is usually stated in probability terms.

Confidence limits are the numerical values that express the ends of the confidence
interval.

Kurtosis is the degree of peakedness of a distribution, usually taken relative to a normal
distribution.


Macros are a set of instructions (e.g., in MINITAB) created by a computer user to
perform specific functions (e.g., statistical calculations or mathematical manipulations).
They may be thought of as a computer program written in the language of a particular
software package.

Mean. The mean is a descriptor of the central tendency of a set of observations. The
arithmetic mean (the most common measure) is the sum of the observations divided by
the number of observations.

MINITAB  is  a  commercially  available  statistical  analysis  package. 

Non-normal refers to a probability density function that is not normal. The pdf may be
some other recognized distribution or it may be undefined.

Normal or normally distributed refers to a probability density function (pdf) which has
the characteristics of the normal pdf.

Precision is the variation in an observation or set of observations due to random error.
It is the measure of the repeatability of a series of observations or measurements.

Relative  error  in  estimating  the  mean  is  a  measure  of  the  efficiency  of  a  sampling  scheme 
to  estimate  a  mean  value. 

Relative  precision  in  estimating  the  mean  is  a  measure  of  the  relative  width  of  the  95% 
confidence  interval. 

Serial  correlation:  See  autocorrelation  coefficient. 

Skewness  refers  to  the  shape  of  a  probability  density  function  (usually  compared  to  the 
normal  distribution).  If  the  longer  tail  occurs  to  the  right  the  distribution  is  said  to  be 
skewed to the right or to have positive skewness.

Significance level is the maximum probability with which we would be willing to risk a
Type I error when testing a given hypothesis.

Standard  deviation  is  a  descriptor  of  the  variability  of  a  set  of  observations. 

Variability is the degree to which a value, such as effluent quality, changes over time or
space.

1.0 INDUSTRIAL EFFLUENTS

Water quality variability is a key factor in the determination of sample frequency
requirements. Of particular interest in this study is variability of constituents which
occurs as a result of changes in effluent quality and quantity.

The range of conditions possible in effluent from Ontario industries is large and, to a
great extent, unknown. The characterization of industrial effluents is, in fact, one of
MISA's objectives.

An overview assessment was undertaken by Zenon Environmental Inc. (Canning, 1988)
to provide insight and background information concerning the range of variability likely
to be encountered in industrial effluent quantity and quality.

Variability  in  effluent  quality  will  range  from  low  to  high  generally  in  response  to: 

•  Degree of effluent treatment. Higher levels of treatment, particularly where
equalization and biological treatment are included, tend to reduce effluent
quality variations and/or lengthen their periodicity under normal operating
conditions. Plant upsets can cause levels to increase dramatically within short
periods of time (hours).

•  Batch  vs.  continuous  processing.  Batch  processing  methods  tend  to  increase 
variations  in  effluent  quality.  The  smaller  the  batch  size,  the  higher  the 
frequency  of  these  variations. 


•  Plant  size  and  degree  of  product  mix.  Small  plants  which  produce  a  wide  range 
of  products  tend  to  have  effluent  which  varies  widely  in  quality  over  relatively 
short  periods  (hours  or  days). 

Variability  in  effluent  quantity  varies  from  low  to  high  generally  in  response  to: 

•  Continuity  of  processing  operation  (1  shift,  2  shifts  or  3  shifts  per  day;  5  or  7  days 
per  week).  Batch  processes  will  generally  produce  high  variability  over  shorter 
durations  compared  to  continuous  processes. 

•  Availability  of  equalization  and/or  effluent  storage  facilities  which  tend  to 
reduce  flow  variations  and  lengthen  their  periodicity. 

•  Seasonal  variations  will  occur  at  plants  which  have  large  areas  of  unpaved 
controlled  surface  drainage,  e.g.  mine  sites. 

•  Daily  and/or  weekly  flow  variations  will  occur  in  response  to  rainfall  events  at 
plants  which  have  large  paved  areas  which  contribute  potentially  contaminated 
stormwater,  e.g.  refineries,  petrochemical  plants.  Extent  of  dampening  will 
depend  on  availability  of  storm  surge  and/or  equalization  ponds. 

Table 1 presents nine examples of industries which represent a broad range of effluent
quality and quantity conditions. These broad groupings formed the basis for the
subsequent investigation of estimates of mean concentrations and loads and of
presence/absence of compounds presented in Chapter 3 of Volume 1 of the report.

2.0 IDEAL DATA BASE CHARACTERISTICS

In ideal terms, a preliminary data set which can be used to determine sampling
frequency requirements in a program should have the following characteristics:

•  long time series,

•  very frequent sampling,

•  regular sampling intervals,

•  good and consistent sample collection techniques,

•  good and consistent analysis methods,

•  low and consistent detection limits (no censored data),

•  wide range of parameters, and

•  in a suitable electronic format.


Support staff of the MISA Advisors Committee assisted in determining the status and
availability of data bases that could be used as a source of "preliminary data" and form
the basis for determining sampling frequency requirements. The results are shown in
Table 2.

PROJECT: 88-19A

Table  1:   OVERVIEW  ASSESSMENT  OF  INDUSTRIAL  EFFLUENT  VARIABILITY 


Variability in Effluent Quality

Variability in Effluent Quantity

Example Industries


High 


High 


High 


High 


High 


High 


Petroleum Refining
    Continuous 24 h/d; 7 d/wk operation plus high level of effluent
    treatment (incl. biological treatment) tends to produce effluents which
    vary over a relatively narrow range of quality and quantity compared to
    other industry groups.

Base Metal Mining
    Large tailing ponds will tend to dampen variability in effluent quality
    with time, but large areas which collect contaminated surface drainage
    will cause quantities to vary day to day in response to rainfall events
    and seasonally in response to snow melt.

Organic Chemical Manufacturing (single product; batch processes)
    Industries which produce basically the same product in batches on a
    repetitive basis (e.g., phenolic resin manufacturers) will tend to have
    effluents whose quality changes very little with time but which may vary
    widely (hour by hour) in volume in response to batch production
    schedules.

Pulp & Paper
    Continuous 24 h/d; 7 d/wk operation plus lagoon based biological
    treatment tends to moderate variations in effluent quality and dampen
    effluent flow fluctuations.

Inorganic Chemical Manufacturing (large, continuous processes, for high use ind. chemicals)
    Larger, continuous production facilities for high use industrial
    chemicals, e.g., NaOH, H2SO4, HCl, HNO3, would be expected to show
    moderate variability in terms of both effluent quantity and quality.
    Organics, when present, tend to be at or below detectable limits. pH is
    the main effluent control parameter. Excursions can last minutes to hours.

Inorganic Chemical (batch processes)
    Generally applies to batch processes without equalization ponds or
    effluent holding tanks. Pattern of variability will range from low
    (weekly to monthly) to high (hourly or daily) depending on batch size
    and product mix. The larger the product mix and the smaller the batch
    size, the more frequent the variations.

Organic Chemical Manufacture (continuous processes)
    Generally applies to mid and large size continuous processing plants
    which manufacture commodity and bulk chemicals, e.g., industrial
    solvents (aromatic, aliphatic, chlorinated).

Organic Chemical Manufacture
    Generally applies to mixed continuous and batch processes, e.g.,
    synthetic rubbers, fibers, plastics, etc.

Organic Chemical Manufacture
    Generally applies to batch produced specialty chemical manufacturing
    with large product mix, e.g., pharmaceuticals, paints, dyes and inks,
    etc. Minimum effluent treatment will enhance variability of effluent
    quantity and quality. Quantity and quality of (indirect) effluent
    usually varies on a continuous basis from zero to maximum depending on
    activity in plant.
1.0  INTRODUCTION

This  appendix  contains  documentation  for  the  set  of  computer  programs  developed  for 
the  report:  "Statistical  Assessment  of  Sampling  Frequency  Requirements  for  Selected 
Aspects  of  the  MISA  Program".  For  convenience,  the  complete  set  of  programs  and 
data  files  are  referred  to  by  the  name  "DESIGN"  (a  reference  to  the  design  of  water 
quality  monitoring  programs). 

DESIGN  consists  of  two  parts.  The  first  part  is  a  stand-alone  executable  program 
written  in  Microsoft  Quick  Basic  Version  4.0.  The  second  part  is  a  series  of  MINITAB 
macro  files.  MINITAB  is  a  commercially  available  statistical  analysis  package.  Macro 
files  are  actually  data  sets  which  contain  instructions  for  the  MINITAB  programs.  The 
two  parts  of  DESIGN  are  complementary.  For  example,  data  sets  generated  by  the
BASIC  program  can  be  analyzed  by  MINITAB  Macros.  The  programs  allow  for 
generation  and  analysis  of  water  quality  concentration  data  with  prescribed  statistical 
characteristics,  e.g.,  well-defined  (approximately  normal)  distributions  versus  ill-defined 
(non-normal)  distributions,  low  versus  high  variability  and  different  detection  limits. 
The  programs  also  allow  for  input  of  data  from  external  sources  for  analysis,  e.g.,  real 
data  generated  through  the  MISA  program  or  elsewhere. 

Program  options  for  the  first  part  are  accessed  through  a  menu  system  as  shown  in 
Figure  1.  Both  program  parts  give  the  user  instructions  interactively. 


FIGURE  1:  MENUS  IN  THE  "SIMULATE"  PROGRAM 


a)  Main  Menu 


MENU


1)  CHANGE  DEFAULT  SETTINGS

2)  DRAW  POLLUTOGRAPH

3)  LOAD  DATASET

4)  CALCULATE  FREQUENCY  DISTRIBUTION

5)  CALCULATE  SUMMARY  STATISTICS

6)  WRITE  DATA  TO  FILE

7)  REDRAW  POLLUTOGRAPH

8)  ANALYSIS  PROGRAMS

9)  QUIT


9   ->


b)  Options  Menu 


MENU 


1)  INITIALIZE  PRINTER  SETTINGS 

2)  SET  TINY  PRINT 

3)  DRAW  DELAY  FACTOR 

4)  RETURN  TO  MAIN  MENU 


4   -> 


c)  Analysis  Menu 


MENU


1)  CONFIDENCE  INTERVAL  FOR  MEANS

2)  PRESENCE/ABSENCE  OF  CONSTITUENTS

3)  CORRELOGRAM

4)  CONFIDENCE  INTERVAL  (FULL)

5)  RETURN  TO  MAIN  MENU


5   ->


2.0  PURPOSE 

The  main  objective  of  this  set  of  computer  programs  is  to  generate  monitoring  scenarios 
which  demonstrate  the  following: 

•  the  effects  of  standard  deviations  on  estimates  of  mean  concentrations  and 
confidence  intervals, 

•  the effects of detection limits on identifying presence or absence of constituents.

Since only limited data are available at present for analyses, a secondary objective of
the model was to generate artificial data sets of predetermined characteristics which
can be used for analysis and to gain insight on sampling frequency requirements.


3.0  HARDWARE  AND  SOFTWARE  REQUIREMENTS 

3.1  BASIC  PROGRAMS 

The  following  minimum  equipment  and  software  are  required  to  run  the  BASIC 
programs. 

An  IBM  PC  or  equivalent  computer  with: 

one  DS/DD  360  K  Byte  drive 

one  Hard  Disk  (must  be  designated  "C"  Drive) 

colour/graphics  display  adapter 

colour  or  composite  monitor 

256  K  Byte  memory 

DOS  Version  3.0  or  higher 

graphics  printer 

The  following  are  not  required  but  will  enhance  program  performance  and/or  function. 

IBM  AT  (or  equivalent) 

math  coprocessor 

joystick  and  games  serial  port 

LOTUS  1-2-3  Version  2.0  or  greater 

3.2  MINITAB  MACROS 

The  minimum  requirement  to  run  this  set  of  programs  is  the  MINITAB  statistical 
analysis  program.  The  hardware  and  other  requirements  to  run  MINITAB  are 
specified  in  their  product  literature. 


4.0  INSTALLATION 

One  DS/DD  diskette  labelled  "DESIGN  PROGRAMS  (88-194)"  is  included. 

The  diskette  contains  three  sub-directories  as  follows: 

DESIGN  -  (Part  1)  contains  BASIC  programs  and  supporting  files. 

MTAB  -  (Part  2)  contains  MINITAB  macros. 

DATA  -  simulated  Data  sets  used  in  the  report. 

To  load  Part  1,  the  BASIC  programs,  follow  these  instructions: 

1.  Turn  on  computer  in  the  normal  way. 

2.  Determine  if  there  is  a  sub-directory  under  the  root  directory  of  drive  C  named 
"DESIGN".  If  there  is,  rename  it. 

3.  Once you are sure there is no sub-directory called "DESIGN":

type  A:
type  CD \DESIGN
type  INSTALL

4.  The  installation  program  will  create  a  directory  C:\DESIGN  and  copy  the 
relevant  files  into  it. 

5.  The current directory will be C:\DESIGN. To run the program type SIMULATE.
Specific user instructions are provided in Section 5.0 of this Appendix.

Part 2, the MINITAB macros, can be copied into your MINITAB subdirectory as follows
(Note: you must have the MINITAB program to continue):


1.  Type CD C:\MTAB (assuming MTAB is the subdirectory containing MINITAB)

2.  Type COPY A:\MTAB\*.*


5.0  USER  INSTRUCTIONS 

5.1  BASIC  PROGRAMS
5.1.1  Data  Preparation 

The  DESIGN  -  Part  1  program  can  generate  artificial  data  or  import  external  files  for 
analysis  in  various  ways.  Before  proceeding  with  any  session  you  must  either  already 
have  data  for  analysis,  or  you  must  generate  it. 

5.1.1.1  Import  Data  Files 

To import data, it must be in an ASCII file in the C:\DESIGN sub-directory and named:

DATA0??.PRN

"??" may be any two-character combination of the letters A - Z and the digits 0 - 9. No
distinction is made on the case of the letters, so a maximum of 1296 (36 x 36) data sets
are permitted at any one time. (If an existing data set name is given when the program
asks for it, the old data set will be overwritten.)

The  information  in  the  data  set  must  represent  daily  conditions  (e.g.,  mean,  min,  or 
max).  The  data  set  consists  of  365  rows  (or  records)  containing  numerical  values  (in 
decimal  format)  which  represent  daily  concentrations.  No  delimiters,  e.g.,  ","  are 
allowed. Example data are included on the distribution disk in the file
DATA00N.PRN. To obtain a printout of the data enter TYPE DATA00N.PRN > PRN.
This data set is used in the verification process discussed later.

If you wish to input a data file which does not meet the above requirements, the
original data must first be modified into a suitable form. For example, if there are
missing data, these should be estimated. If there are more than 365 values then a
subset of 365 values should be selected. If multiple values exist for any day, these
should be averaged and the average values entered.
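
The file requirements above can be checked before a data set is handed to SIMULATE. The following Python sketch is not part of DESIGN; the function name `check_daily_series` and the decision to report problems rather than repair them are assumptions made for illustration only.

```python
def check_daily_series(lines, expected=365):
    """Check text rows against the import rules: one decimal value per row,
    no delimiters, and exactly `expected` rows. Returns (values, problems)."""
    values, problems = [], []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            problems.append(f"row {number}: missing value")
            continue
        # A comma or embedded blank means a delimiter or several values on one row.
        if "," in text or len(text.split()) != 1:
            problems.append(f"row {number}: delimiters or multiple values")
            continue
        try:
            values.append(float(text))
        except ValueError:
            problems.append(f"row {number}: not a decimal number")
    if len(values) != expected:
        problems.append(f"{len(values)} valid rows, expected {expected}")
    return values, problems
```

A file could be checked with `check_daily_series(open(path).read().splitlines())`, where `path` is whichever data file is being prepared. An empty problem list suggests the file is in the expected layout.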

If  the  data  file  to  be  input  meets  the  above  requirements  it  can  be  loaded  for  analysis  by 
typing  in  the  "RUN  #"  (i.e.,  0-9  or  A-Z)  at  the  program  request. 


If  the  data  file  does  not  exist  you  will  be  returned  to  the  previous  menu.  If  the  data  file 
exists  but  has  incorrect  format  an  error  will  be  produced  and  the  program  will 
terminate. 


EXAMPLE SESSION


Type            (comments)

C:\DESIGN       change directories
SIMULATE        this starts the program
                select input data file from main menu
                select run #1, i.e., DATA001.PRN
                display of x-y graph of file
                selects further analysis programs
etc.            follow program prompts.


5.1.1.2  Generate  Data  Files

An  important  feature  of  the  program  is  that  the  user  is  provided  with  the  option  of 
generating  artificial  data  for  analysis.  This  feature  requires  a  joystick  and  games 
adapter  to  be  installed  in  the  computer. 


This option is accessed through the main menu (see Figure 1) (enter 2). The program
asks for the maximum concentration, units and run number. Next the program draws
an x-y graph, sounds an alarm and waits for the user to press a key to start input. The
y-axis is the concentration of the constituent which is to be simulated. The x-axis is
time, in 12 numbered increments representing months.

When  a  key  is  pressed  the  program  moves  the  cursor  steadily  from  left  to  right.  The 
magnitude  of  the  concentration  is  determined  by  the  Y-axis  positioning  of  the  joystick 
control.  The  rate  of  change  of  concentration  is  determined  by  the  horizontal  speed  of 
the  cursor  and  the  speed  with  which  the  user  changes  the  joystick  position. 

With  a  little  practice  a  wide  range  of  time-concentration  graphs  can  be  easily  generated. 

NOTE: When using an AT computer and/or when a math co-processor is installed, the
horizontal cursor speed may be too fast. The speed can be modified by selecting 1 from
the main menu (change options) and 3 from the options menu (change cursor speed).
To slow the cursor speed, increase the delay factor.

It  is  possible  to  enter  and/or  edit  data  sets  manually  using  any  available  word  processor. 

When  365  data  values  have  been  generated  the  program  pauses  for  the  user  to  view  the 
graph. 

If desired, a printout of the graph on the screen can be obtained. To accomplish this
the user must have entered the DOS "graphics" command prior to running the program.
If you have not done this and wish to print the screen, stop the program now. At the
DOS prompt type: C:\DOS\GRAPHICS (assuming your DOS programs are in a
\DOS directory).

To  printout  the  graph  type  Shift-Prt  Scr. 

When generating artificial data the user may not get what is wanted on the first try.
The program allows the user to redraw the graph as many times as desired simply by
entering the "3" - (Draw) option. If the graph looks promising, further analyses are
provided for.


Option  4  on  the  main  menu  invokes  a  frequency  distribution  analysis  of  the  data  on  the
screen.  This  provides  a  visual  impression  of  the  type  of  distribution  generated.  Shift 
Prt  Scr  will  print  the  frequency  distribution  if  a  graphics  printer  is  installed.  Output 
from  this  option  is  shown  in  Figure  2. 

Option  5  (Summary  statistics)  calculates  the  mean,  maximum,  minimum,  standard 
deviation,  variance,  range,  skewness  coefficient,  Kurtosis  coefficient,  excess  coefficient 
and  autocorrelation  coefficient  (lag  =  1)  for  the  data  on  the  screen.  Output  from  this 
option  is  shown  in  Figure  3. 
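
The statistics listed for Option 5 can be reproduced outside the program for checking purposes. The Python sketch below is not the SIMULATE code, and the appendix does not state which formulas the program uses. It assumes population (divide by N) moments and a simple lag-1 autocorrelation estimate; the function name `summary_statistics` is hypothetical.

```python
import math

def summary_statistics(x):
    """Descriptive statistics for a series, using population (divide-by-N)
    moments. The formulas are assumptions, not taken from the program."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    m2 = sum(d ** 2 for d in dev) / n   # variance (second central moment)
    m3 = sum(d ** 3 for d in dev) / n   # third central moment
    m4 = sum(d ** 4 for d in dev) / n   # fourth central moment
    # Lag-1 autocorrelation: products of neighbouring deviations over the
    # total sum of squared deviations.
    lag1 = sum(a * b for a, b in zip(dev, dev[1:])) / sum(d ** 2 for d in dev)
    kurtosis = m4 / m2 ** 2
    return {
        "mean": mean,
        "maximum": max(x),
        "minimum": min(x),
        "range": max(x) - min(x),
        "variance": m2,
        "std_dev": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": kurtosis,
        "excess": kurtosis - 3.0,       # excess coefficient = kurtosis - 3
        "lag1_autocorrelation": lag1,
    }
```

Other packages may use unbiased (divide by N - 1) estimators, so small differences from their output are expected for the variance and standard deviation.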

Option  7  redraws  the  graph  on  the  screen. 

If  the  user  is  satisfied  with  the  graph  produced  it  can  now  be  saved  by  selecting  option 
6.  The  program  prompts  for  a  run  #  (0-9  and/or  A-Z)  which  will  be  used  to  name  and 
save  the  data  on  disk. 

5.1.2  Data  Analysis 

At this point it is assumed that a suitable data set exists on file. Analysis programs are
available at two levels.

Options 4 (frequency distribution), 5 (summary statistics) and 7 (redraw graph), which
were described in the previous section, are still available for use. If you have not
reviewed the data for some time or are inputting new data (option 3), it may be
desirable to do these analyses first.

The  main  analysis  programs  are  accessed  through  option  8  on  the  main  menu  (Figure 
1).  This  loads  the  analysis  option  menu  which  produces  the  following  choices. 

1.  Confidence  Limits 

2.  Presence/Absence  Analysis 

3.  Correlogram 

4.  Confidence  Limits  (Full) 

5.  Return  to  Main  Menu 

FIGURE 2: OUTPUT FROM OPTION 1-4

FIGURE 3: SCREEN OUTPUT FOR OPTION 1-5

5.1.2.1  Confidence  Limits  Analysis

Upon selecting this option the program asks for the run # (data set name). You must
provide a valid entry, even if you have been analyzing a data set previously.

The  program  divides  the  365  (daily)  data  set  into  12  subsets  (months)  of  data  consisting 
of  30  days  each.  (The  last  5  days  of  the  year  are  ignored).  For  each  month  the 
following  is  computed:  upper  confidence  limit,  mean,  lower  confidence  limit,  range, 
standard  deviation,  variance,  autocorrelation  coefficient  (lag  =  1),  skewness  coefficient, 
delta  and  frequency. 

This  is  done  for  three  subsets  of  the  month  as  follows: 

1.  daily  (N=30)  assumed  to  be  the  population  (true)  data 

2.  thrice  weekly  (N  =  13),  and 

3.  weekly  (N  =  4) 

The  results  of  analysis  are  reported  in  a  printout.  (See  Figure  4) 

When  executing  the  analysis  the  user  is  given  the  option  of  printing  out  the  raw  data. 

For  each  analysis  run  12  simulations  (i.e.,  months)  are  performed. 
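
The monthly subsampling described above can be sketched in Python as follows. This is an illustration, not the SIMULATE code. The day patterns used for thrice weekly and weekly sampling, the 95% confidence level and the use of Student t intervals are all assumptions, since the appendix gives only the sample sizes (N = 30, 13 and 4). The names `SCHEMES`, `T_95` and `monthly_intervals` are hypothetical.

```python
import math

# Two-sided 95% Student t critical values, keyed by sample size n (n - 1 d.f.).
T_95 = {4: 3.182, 13: 2.179, 30: 2.045}

# Day offsets sampled within each 30-day month (assumed patterns).
SCHEMES = {
    "daily": list(range(30)),                                       # N = 30
    "thrice weekly": [d for d in range(30) if d % 7 in (0, 2, 4)],  # N = 13
    "weekly": [0, 7, 14, 21],                                       # N = 4
}

def monthly_intervals(series):
    """Return (month, scheme, n, mean, lower, upper) for 12 months of 30 days.
    Days 361 to 365 are ignored, as in the text."""
    results = []
    for month in range(12):
        block = series[month * 30:(month + 1) * 30]
        for name, days in SCHEMES.items():
            sample = [block[d] for d in days]
            n = len(sample)
            mean = sum(sample) / n
            # Unbiased sample standard deviation (divide by n - 1).
            sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
            half_width = T_95[n] * sd / math.sqrt(n)
            results.append((month + 1, name, n, mean,
                            mean - half_width, mean + half_width))
    return results
```

Comparing the thrice weekly and weekly means with the daily mean for the same month gives a direct view of how much accuracy is lost at each sampling frequency.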

5.1.2.2  Presence/Absence  Analysis 

This option (#2) performs a function similar to the MINITAB macros. It differs in the
way that the binary data set is derived. Whereas the MINITAB macros generate a
binary data set according to a specified θ and r, this program module uses data values
(i.e., concentrations) and determines presence or absence of a constituent based on a
detection limit supplied by the user.

(θ is the true proportion of detected values to total number of days, i.e., the population
mean, and r is the autocorrelation coefficient with a lag = 1.)

FIGURE 4: OUTPUT OF OPTION 8-1

The program samples the population data set for a variety of sampling scenarios as
follows: 

annual 

semi-annual 

quarterly 

bi-monthly 

monthly 

weekly 

thrice  weekly 

daily 

The  following  statistics  are  calculated  for  each  sampling  scenario:  mean,  standard 
deviation,  variance,  number  of  hits,  number  of  misses,  %  of  actual,  FOD  and  FOD 
error. 

The number of hits is the number of 1's, or above-detection-limit values, identified by
sampling. The % of actual is the number of hits as a percentage of the true number of
1's. FOD is the frequency of detection and represents the estimated percent of samples
that are above detection based on sampling. FOD error is the difference between the
estimated FOD and the population FOD. The population statistics are provided in the
"daily" sampling scenario.

The program offers the opportunity to repeat the analyses as often as desired using
different user-supplied detection limits. Changing the detection limit will change θ.
Sample output using the test data is shown in Figure 5.
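
The presence/absence statistics defined above can be illustrated with a short Python sketch. It is not the program module itself. The sampling intervals in days, the Monday/Wednesday/Friday style pattern for thrice weekly sampling, and the strict "greater than" comparison against the detection limit are assumptions, and the names `INTERVALS`, `sample_days` and `presence_absence` are hypothetical.

```python
# Assumed spacing in days between samples for each scenario.
INTERVALS = {"annual": 365, "semi-annual": 183, "quarterly": 92,
             "bi-monthly": 61, "monthly": 31, "weekly": 7, "daily": 1}
SCENARIOS = ["annual", "semi-annual", "quarterly", "bi-monthly",
             "monthly", "weekly", "thrice weekly", "daily"]

def sample_days(n_days, scenario):
    """Day indices visited under a scenario (patterns are assumptions)."""
    if scenario == "thrice weekly":
        return [d for d in range(n_days) if d % 7 in (0, 2, 4)]
    return list(range(0, n_days, INTERVALS[scenario]))

def presence_absence(concentrations, detection_limit):
    """Hits, misses, % of actual, FOD and FOD error for each scenario."""
    binary = [1 if c > detection_limit else 0 for c in concentrations]
    true_hits = sum(binary)
    population_fod = 100.0 * true_hits / len(binary)
    table = {}
    for scenario in SCENARIOS:
        sample = [binary[d] for d in sample_days(len(binary), scenario)]
        hits = sum(sample)
        fod = 100.0 * hits / len(sample)
        table[scenario] = {
            "samples": len(sample),
            "hits": hits,
            "misses": len(sample) - hits,
            "pct_of_actual": 100.0 * hits / true_hits if true_hits else 0.0,
            "fod": fod,
            "fod_error": fod - population_fod,
        }
    return table
```

Raising the detection limit lowers the proportion of 1's in the binary series, so repeating the call with several limits shows how detectability degrades under sparse sampling.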

FIGURE 5: OUTPUT FROM OPTION 8-2

5.1.2.3  Correlogram

Option 3 performs a correlogram analysis of the data set for lag = 1 to 180. A
correlogram is a useful tool for investigation of periodic behaviour in time series data
(Loftis et al., 1987). It is a plot of the estimated autocorrelation coefficient for the
time series data against lag. Regular peaks in the autocorrelation function usually
indicate periodic behaviour. If periodicity is identified (e.g., weekly cycles) then the
data should be adjusted to remove the cycles.

This  program  calculates  autocorrelation  values  and  writes  the  results  to  a  file  named: 

C:\DESIGN\AUTOCOR?.PRN 

(Where  "?"  is  user-supplied) 

This file can be imported into LOTUS 1-2-3 using the File-Import command sequence.
LOTUS 1-2-3 can be used to analyze the data further or produce a plot of the
correlogram. See Figure 6 for an example of a correlogram.
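
For readers without LOTUS 1-2-3, the correlogram values can be approximated with the Python sketch below. The estimator shown (lagged products of deviations divided by the total sum of squared deviations) is a common choice but is an assumption here, since the appendix does not give the formula used by the program. The function name `correlogram` is hypothetical.

```python
def correlogram(x, max_lag=180):
    """Estimated autocorrelation coefficients for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]
```

With a series that repeats every seven days, the returned list shows peaks at lags 7, 14, 21 and so on, which is the pattern of regular peaks described above.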

5.1.2.4  Confidence  Limits  (Full  Analysis)

Option 4 performs the same analysis as Option 1 except the sampling frequency varies
from 1 to 14. See Figure 7 for sample output.

Option  5  ends  the  analysis  module  and  returns  the  user  to  the  main  menu. 

Data files used or generated by these programs can easily be input to other package
programs for further analysis. For example, the data file C:\DESIGN\DATA001.PRN
could be input to LOTUS 1-2-3 for further analysis. Similarly, the data set could be
input to any statistical analysis program for more sophisticated statistical analysis.

To  accomplish  this  the  user  should  refer  to  the  documentation  for  the  particular 
analysis  program  to  be  used. 

FIGURE 7: EXAMPLE OUTPUT FROM OPTION 8-4

5.2  MINITAB  MACROS

As discussed previously the user must own a copy of the statistical analysis package
"MINITAB" to run this part of the model. The MINITAB macros are contained on the
distribution diskette in the files:

BINSER0.DAT
BINSER1.DAT
BINSER2.DAT
BINSER3.DAT
BINSER4.DAT
BINSER5.DAT
BINSER6.DAT

These macros consist of two parts; namely, a part to generate a binary data set with
specified θ and r (BINSER0.DAT to BINSER2.DAT) and a part to analyze the data
set for various sampling scenarios (BINSER3.DAT to BINSER6.DAT).

5.2.1  Data  Generation 

To  generate  a  binary  (0/1)  series  (i.e.,  time  series)  having  specified  parameters  type: 

EXEC  "BINSER0.DAT"

The program prompts the user to input the mean (θ) of the series, i.e., the true
long-term mean probability of state 1 occurring, the autocorrelation coefficient (r)
with lag = 1 and the length (N) of the series to be generated.

To accomplish this, the user must type:

let  K1  =  0.5  (return)
let  K2  =  0.6  (return)
let  K3  =  90  (return)
EXEC  "BINSER1"  (return)

In the above example, θ = 0.5, r = 0.6 and N = 90.

The program produces the specified binary data set in column C10.
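
One way to generate a binary series with a prescribed mean θ and lag-1 autocorrelation r is a two-state Markov chain. The Python sketch below uses that approach; it is an assumption that the MINITAB macros work in a comparable way, as their listing is not reproduced here. The function name `binary_series` and the `seed` parameter are hypothetical. For a stationary two-state chain, P(1 | 1) - P(1 | 0) equals the lag-1 autocorrelation, which is why the two transition probabilities below differ by exactly r.

```python
import random

def binary_series(theta, r, n, seed=None):
    """Generate n values of 0/1 with long-term mean theta and lag-1
    autocorrelation r, using a two-state Markov chain (assumed method)."""
    rng = random.Random(seed)
    p_one_given_one = theta + r * (1.0 - theta)   # P(next = 1 | current = 1)
    p_one_given_zero = theta * (1.0 - r)          # P(next = 1 | current = 0)
    state = 1 if rng.random() < theta else 0      # start from the stationary mean
    series = [state]
    for _ in range(n - 1):
        p_one = p_one_given_one if state == 1 else p_one_given_zero
        state = 1 if rng.random() < p_one else 0
        series.append(state)
    return series
```

For θ = 0.5 and r = 0.6 the transition probabilities are 0.8 and 0.2. Strongly negative values of r can push these probabilities outside the range 0 to 1, so the sketch is intended for r between 0 and 1.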
5.2.2  Data  Analysis

To analyze a binary data set generated previously, type:

EXEC  "BINSER3.DAT"

The program will then print the binary data set, plot presence/absence versus N and
calculate mean and number of runs of 1.

Next, the 1's are replaced by run number and printed. A run is a number of 1's
occurring sequentially without interruption.

Next,  the  autocorrelation  coefficient  (lag  =  1)  and  chi-square  statistic  (1  degree  of 
freedom)  are  calculated  and  printed. 

At  this  point  the  program  prompts  the  user  to  enter  the  sampling  interval  to  investigate. 
To  sample  every  other  day,  for  example,  type: 

LET  K14  =  2 

EXEC  "BINSER5.DAT" 

The  program  now  samples  at  the  specified  frequency  and  calculates  and  prints  the 
number  of  runs  detected.  The  sampling  frequency  can  be  repeated  for  various  intervals 
as  necessary. 
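
The run labelling and sampling steps described in this section can be sketched in Python as follows. This is an illustration of the idea, not a translation of the macros. The function names `label_runs` and `runs_detected` and the `start` parameter are hypothetical.

```python
def label_runs(binary):
    """Replace each 1 by the number of the run it belongs to; 0 stays 0."""
    labels, run, previous = [], 0, 0
    for value in binary:
        if value == 1 and previous == 0:
            run += 1                      # a new run of 1's starts here
        labels.append(run if value == 1 else 0)
        previous = value
    return labels

def runs_detected(binary, interval, start=0):
    """Return (runs seen when sampling every `interval` steps, total runs)."""
    labels = label_runs(binary)
    total = max(labels, default=0)
    seen = {label for label in labels[start::interval] if label > 0}
    return len(seen), total
```

For example, `runs_detected(series, 2)` corresponds to sampling every other observation, as with K14 = 2 above. Short runs are the ones most likely to be missed as the interval grows.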

5.2.3  Other  Features 

The macros BINSER3.DAT to BINSER6.DAT can be used either to analyze a data set
which has been generated by BINSER0.DAT to BINSER2.DAT or data from some
other source, e.g., simulated data from the BASIC programs (Section 5.1) or real data.
In the case of real data it will be necessary to translate the actual concentration values
into a series of 1's and 0's based on some user-specified criterion such as a detection
limit. This must be done manually or with some other program not currently available.
The results must be stored in column C10 for the MINITAB macros to work.

The analysis can then be performed (using BINSER3.DAT) as before to calculate θ
and r and then analyze for various sampling scenarios.

Alternatively, the user could write his own macros which could be used to analyze the
data and determine θ and r. Those values could then be used to generate a binary data
set, with these characteristics, for further analysis.

When running these macros the user must select an appropriate time frame, e.g., daily
or weekly. If N=210 is specified and a time frame of daily observations is assumed then
the programs would be useful for examining sampling schemes in the range from every
other day to monthly (i.e., you cannot get two semi-annual samples in a data base of
less than a year). If a time frame of weekly is assumed for a value of N=210 then
sampling schemes in the range of weekly to semi-annually can be evaluated.

When selecting a time frame the user is cautioned that he must also choose θ and r
values that are consistent with the selected time frame. For example, for daily samples
θ is the true mean daily probability of presence and r is the autocorrelation coefficient
for lag = 1 day, and for weekly samples the appropriate weekly statistics must be used.

6.0  PROGRAM  VERIFICATION

6.1  INTRODUCTION 

The  purpose  of  program  verification  is  to  document  proof  that  the  program  works  as 
intended.  Two  methods  have  been  used  to  perform  verification  checks.  The  first 
method  is  to  perform  all  analysis  by  hand  and  compare  the  results  to  the  program 
output.  The  second  method  is  the  use  of  other  programs  (known  to  function  correctly) 
to  provide  a  check  of  results. 

The data set used for verification of the BASIC programs was
C:\DESIGN\DATA00N.PRN and is contained on the distribution disk.

6.2  FREQUENCY  DISTRIBUTION

The  frequency  distribution  program  module  is  accessed  through  Option  #4  on  the  main 
menu  of  the  SIMULATE  program.  Following  is  a  listing  of  the  verification  steps  used 
(i.e.,  Method  1). 

MENU ITEM #4 (FREQUENCY DISTRIBUTION)

1.  The data to be analyzed must be input by means of the Main Menu Item #3 (run
number = N)

2.  Select  number  of  intervals  =  10 

3.  Determine  Maximum  from  Table  1:  =  154  for  Record  #49 

4.  Determine  Minimum  from  Table  1:  =  0  for  Record  #56 

5.  Range  =  Maximum-Minimum  =  154-0  =  154 

6.  Interval  Width  =  Range/Interval  number  =  154/10  =  15.4  (program  truncates 
result  therefore  Interval  width  =  15). 

TABLE  1  -  continued


DATASET   C:\DESIGN\DATA00N.PRN

36     32     28     67      0      4     79     46     68     39     32    105     90 
68     92     60    108     62     99     66     80     58    105     52     86     65 
82     62    102     83     UU  99     55    117     80     74    104     45    114     78 

72     98     60    120     81     96    113     78    154    143     78     84     40 
18      56       0      57      46     41      44      94      27      82      60      72     48 
72     84     50    118    112     66    108     70    114     80     97     88     82     87 
74    102     44    109     54    101     66     97     76     78     98     29    106 
94     70    110     54    105     66     74     98     42    101     43     93     52 
66     31     66     64     28     50     20     73     35     66     94     36    100     60 
83     88    102    109     60     82     63     90     66     77     96     56    102 
56    101     37     87     42     92     32     82     63     78     78     41     95 
42    101     42     87     40     87     47     92     62     88     54     60     67     37 
80     34    106      45      96      40      98      47     101      34      89     48     82 
28     66     40     60     54     34     93     34    102     40    104     43     78 
50     62      80     47      76      54    124     107      84    125      95    104      54      35 
60    122     54    106    129    130    135    122     69     70     33      1     50 
Al      ^9    104    104      80      98    117      64    108    136    140    114      70 
a5    116     82     98     72     81    118     62     88    119     38     63    109     83 
66    115     82     40     99    108    121    137    137     68    104     94     62 
74    107     63     45     84     57     66     96     50     76     65     50     74 
88     68     56     78    120     84     88    109     54     50     95    106     59     79 
52     60     79     80     55     75     94     64     8  2     69     63     78     43 
85     67     86    111     65     91     58    107     66     98     60    112     67 
98     68     80     88     64    114     52    108    115     46     85    123    103     41 
114     92     40    118     92     52    126     72     71    124     48    118     59 
83     50     96    108     27     71     84     31     19     27     95     78     46    1 
16    128    129     80     48     91    112     57     40     69    121    137    132     7  i. 
68     40     22     67     64 

Interval classes are therefore:

Class #   Limits

1           0 -  14
2          15 -  29
3          30 -  44
4          45 -  59
5          60 -  74
6          75 -  89
7          90 - 104
8         105 - 119
9         120 - 134
10        135 - 154

(See  Figure  4a  to  verify  screen  output  is  correct). 

Note:    Last  class  is  larger  than  15  to  correct  for  decimal  truncation  in  step  #6. 

Count  absolute  frequency  distribution  occurrence  in  selected  classes  (limits  are 
included  in  classes) 

Class     Limits       Number of Values

1           0 -  14       4
2          15 -  29      11
3          30 -  44      40
4          45 -  59      49
5          60 -  74      73
6          75 -  89      66
7          90 - 104      56
8         105 - 119      42
9         120 - 134      16
10        135 - 154       8

Total                   365

(Counts from Table 1 agree with screen output in Figure 4a)
Note:  (See Table 1)

Compute the relative frequency (percent) by dividing the absolute frequency of each
class by 365 (the total number of values) and multiplying by 100.


Class     Limits       Absolute   Relative   Cumulative

1           0 -  14        4        1.10        1.10     (4/365 x 100 = 1.10)
2          15 -  29       11        3.01        4.11
3          30 -  44       40       10.96       15.07
4          45 -  59       49       13.42       28.49
5          60 -  74       73       20.00       48.49
6          75 -  89       66       18.08       66.58
7          90 - 104       56       15.34       81.92
8         105 - 119       42       11.51       93.42
9         120 - 134       16        4.38       97.81
10        135 - 154        8        2.19       99.99

                                               100 (rounded)

The  above  data  agree  with  the  screen  results  shown  in  Figure  2b  and  2c. 
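
The hand verification in this section can be repeated with the Python sketch below. It follows the steps stated in the text (truncated interval width, a wider last class, relative frequency as count/365 x 100). The function names `class_limits` and `frequency_table` are hypothetical, and computing the cumulative column from running counts is an assumption, so the final cumulative value comes out as exactly 100.

```python
def class_limits(minimum, maximum, n_classes=10):
    """Class limits with a truncated interval width; the last class is
    widened to reach the maximum, as noted for step 6."""
    width = int((maximum - minimum) / n_classes)   # 154 / 10 = 15.4 -> 15
    limits = []
    for k in range(n_classes):
        lower = minimum + k * width
        upper = maximum if k == n_classes - 1 else lower + width - 1
        limits.append((lower, upper))
    return limits

def frequency_table(counts):
    """Return (absolute, relative %, cumulative %) for each class."""
    total = sum(counts)
    rows, running = [], 0
    for count in counts:
        running += count
        rows.append((count,
                     round(100.0 * count / total, 2),
                     round(100.0 * running / total, 2)))
    return rows
```

Calling `frequency_table([4, 11, 40, 49, 73, 66, 56, 42, 16, 8])` reproduces the relative frequencies for the class counts tabulated in this section.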

6.3  DESCRIPTIVE  STATISTICS

The  descriptive  statistics  program  is  accessed  through  Option  #5  on  the  main  menu  of 
the  SIMULATE  program.  The  screen  output  is  shown  in  Figure  3. 

The output for this program was verified by using the commercially available statistical
analysis program STATPAC GOLD version 3.0. The same data set was input to
STATPAC and analyzed. The results are contained in Figures 8 and 9. All of the
statistics produced on Figure 3 agree with those on Figures 8 and 9. The Excess
Coefficient is not calculated by STATPAC. It is simply (Kurtosis - 3) or
(2.5471 - 3) = -0.4529.

6.4  CONFIDENCE  INTERVAL  FOR  MEANS 

The  confidence  interval  for  means  program  is  accessed  through  Option  #8  of  the  main 
menu  and  Option  #1  of  the  analysis  menu  (SIMULATE  program).  Figure  4  contains 
the  output  of  the  analysis. 

This section of code is used repeatedly in various parts of the program to compute
statistics using data which are sampled according to different scenarios.

FIGURE 8: USE OF STATPAC GOLD VERSION 3.0 TO VERIFY "SIMULATE.EXE" PROGRAM


DESCRIPTIVE STATISTICS FOR C:\DESIGN\DATA00N.PRN
concentration

Minimum = 0
Standard error of the mean = 1.5038
95 percent confidence interval around the mean = [illegible]
Variance (unbiased) = [illegible]
Standard deviation (unbiased) = 28.7308
Kolmogorov-Smirnov statistic for normality = [illegible]

Response percent = 100.0

FIGURE 9: USE OF STATPAC GOLD VERSION 3.0 TO VERIFY "SIMULATE.EXE" PROGRAM

(STATPAC correlation analysis output for C:\DESIGN\DATA00N.PRN; the descriptive
statistics block reports N = 365. The remaining values are illegible.)

FIGURE 10: STATPAC OUTPUT FOR FIRST 30 RECORDS OF TEST DATA


StatPac Gold Statistical Analysis Package

DESCRIPTIVE STATISTICS FOR DATA00..DAT
concentration

Minimum = 0
Maximum = 108
Range = 108
Sum = 1956
Mean = 65.2000
Median = 66.5000
Mode = Multi-Modal
Variance = 801.8933
Standard deviation = 28.3177
Standard error of the mean = 5.2585
95 Percent confidence interval around the mean = 54.8934 - 75.5066
Variance (unbiased) = 829.5448
Standard deviation (unbiased) = 28.8018
Skewness = -0.4861
Kurtosis = 2.626?
Kolmogorov-Smirnov statistic for normality = 0.5603

Valid cases = 30
Missing cases = 0
Response percent = 100.0
The verification consists of two parts:

1. Checking the accuracy of statistical calculations, and

2. Checking the accuracy of the sampling procedure.

The  calculations  were  checked  for  month  #1  of  the  printout  and  for  the  "daily" 
sampling  scheme.  Table  1  (Section  6.2)  contains  the  raw  data. 

The  sampling  scenarios  were  derived  manually  and  input  for  analysis  by  STATPAC. 
The  results  are  shown  in  Figures  7  and  8.  All  of  the  statistics  produced  agree  with  those 
on  Figure  6. 
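A manual check of a sampling scenario amounts to picking out the days a scheme would sample and recomputing the statistics on that subset. The Python sketch below illustrates the idea for one month of daily data. The function names, the 30-day record, and in particular the definition of "thrice weekly" as days 1, 3 and 5 of each 7-day week are assumptions made for illustration; the report does not spell out which days SIMULATE selects.

```python
def sample_every(daily, interval, start=0):
    """Take every interval-th daily value (interval 1 is daily, 7 is weekly)."""
    return daily[start::interval]


def thrice_weekly(daily):
    """Assumed scheme: days 1, 3 and 5 of each 7-day week."""
    return [v for i, v in enumerate(daily) if i % 7 in (0, 2, 4)]


def relative_error(sample, population):
    """Percent error of the sample mean relative to the full-record mean."""
    pop_mean = sum(population) / len(population)
    return (sum(sample) / len(sample) - pop_mean) / pop_mean * 100.0


# Hypothetical 30-day record of concentrations.
month = [float(v) for v in range(1, 31)]

weekly = sample_every(month, 7)
thrice = thrice_weekly(month)
print(len(weekly), len(thrice), round(relative_error(weekly, month), 2))
```

The subsets produced this way can be entered into an independent package and compared line by line with the program printout.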

6.5  MINITAB  MACROS

Following  is  a  listing  of  the  program  output  from  sample  data  included  on  the  diskette. 
BINSERX2.LIS          Thursday, July 14, 1988

MTB > exec 'binser3.dat'        ***** Entry from keyboard *****
MTB > #
MTB > # You may be executing this macro as a follow-up to 'BINSER1.DAT',
MTB > #   which would have been used to generate a binary series with
MTB > #   specified parameters and to store it in column C10. If so, you will
MTB > #   interpret the following analysis and evaluation as a description of a
MTB > #   simulated series which has known (specified) parameters. Thus the
MTB > #   sample estimates of the parameters (from 'BINSER3.DAT') can be
MTB > #   compared to their known values (the input to 'BINSER1.DAT').
MTB > #
MTB > # Alternatively, the binary series (which must be stored in column C10)
MTB > #   could be a real monitoring data set where state 1 represents
MTB > #   "detectable" or "violation of the standard" and state 0 represents
MTB > #   "below detectable" or "not in violation of the standard". If this is
MTB > #   the case, then the following analysis and evaluation is for data
MTB > #   whose parameters are not known but for which estimates are desired.
MTB > #   You may wish to use these parameter estimates (of mu and rho)
MTB > #   as input to 'BINSER1.DAT', to simulate binary series having these
MTB > #   properties, and then to run 'BINSER3.DAT' again to evaluate the
MTB > #   efficiency of various sampling schemes for use in sampling such
MTB > #   monitoring data.
MTB > #
MTB > # Now we will analyze the binary series:
MTB > PRINT C10


C10
[Listing of the 210 values of the binary series stored in C10 (94 ones and 116 zeros).]

MTB > TSPLOT C10

[Time-series plot of C10 against observation number, printed in four panels covering observations 0 to 60, 60 to 120, 120 to 180 and 180 to 200. The vertical axis is marked at 0.000, 0.350, 0.700 and 1.050; every observation plots at either 0 or 1.]

MTB > MEAN C10
   MEAN    =     0.44762
MTB > RUNS 0.5 C10

    C10
    K =     0.5000

    THE OBSERVED NO. OF RUNS =  101
    THE EXPECTED NO. OF RUNS =  104.8476
    94 OBSERVATIONS ABOVE K   116 BELOW
              THE TEST IS SIGNIFICANT AT  0.5905
              CANNOT REJECT AT ALPHA = 0.05
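The runs test figures above can be checked by hand. The Python sketch below uses the standard formulas for the expected number and variance of runs with a normal approximation; that Minitab uses this same approximation is an assumption, and the function names are hypothetical. Applied to the counts printed above (101 runs, 94 observations above K and 116 below), it gives an expected number of runs of about 104.85 and a two-sided significance level of about 0.59.

```python
import math


def count_runs(series, k):
    """Return (number of runs, count above k, count at or below k)."""
    above = [x > k for x in series]
    if not above:
        return 0, 0, 0
    runs = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    return runs, sum(above), len(above) - sum(above)


def runs_test(observed, n_above, n_below):
    """Return (expected runs, z statistic, two-sided significance level)."""
    n = n_above + n_below
    product = 2.0 * n_above * n_below
    expected = product / n + 1.0
    variance = product * (product - n) / (n * n * (n - 1.0))
    z = (observed - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return expected, z, p


# Counts taken from the Minitab output above.
expected, z, p = runs_test(101, 94, 116)
print(round(expected, 4), round(z, 3), round(p, 3))
```

A significance level this large means the observed number of runs is consistent with a random ordering of 0s and 1s.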

MTB > LET C13(1)=C10(1)
MTB > LET K12=1
MTB > LET K13=2
MTB > LET K3=COUNT(C10)
MTB > LET K8=K3-1
MTB > OH=0
MTB > EXEC 'BINSER4.DAT' K8 TIMES
MTB > LET C13(K13)=C10(K13)-C10(K12)
MTB > LET K12=K12+1
MTB > LET K13=K13+1
MTB > LET C13(K13)=C10(K13)-C10(K12)
MTB > LET K12=K12+1
MTB > LET K13=K13+1
MTB > LET C13(K13)=C10(K13)-C10(K12)
MTB > LET K12=K12+1
MTB > LET K13=K13+1

          (Repeated executions deleted here)

MTB > LET C13(K13)=C10(K13)-C10(K12)
MTB > LET K12=K12+1
MTB > LET K13=K13+1
MTB > LET C13(K13)=C10(K13)-C10(K12)
MTB > LET K12=K12+1
MTB > LET K13=K13+1
MTB > OH=24
MTB > CODE (-1) 0 C13,C13
MTB > PARSUM C13,C14
MTB > LET C15=C10*C14
MTB > LET K12=MAXI(C15)
MTB > #
MTB > # The number of runs of 1 is:
MTB > PRINT K12
K12       51.0000
MTB > #
MTB > # Here is the binary series again, with the 1's replaced by the run
MTB > #   number (1st, 2nd, run of 1's):
MTB > PRINT C15


C15
[Listing of the 210 values of C15: each 1 of the binary series is replaced by the number (1 to 51) of the run of 1's to which it belongs; zeros are unchanged.]
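The run-numbering steps of the macro (first differences, recoding -1 to 0, partial sums, and multiplication by the original series) can be mirrored in a few lines of Python. This is a sketch of the same idea rather than a translation of the macro; the function name and the seven-value series are hypothetical, and a non-empty series of 0s and 1s is assumed.

```python
from itertools import accumulate


def label_runs(series):
    """Replace each 1 by the number of the run of 1's it belongs to.

    Returns (labelled series, number of runs of 1).
    """
    # 1 where a run of 1's starts, 0 elsewhere (differences of -1 recoded to 0).
    starts = [series[0]] + [max(b - a, 0) for a, b in zip(series, series[1:])]
    run_index = list(accumulate(starts))  # partial sums, as PARSUM does
    labels = [x * r for x, r in zip(series, run_index)]
    return labels, max(labels)


# Hypothetical short series containing three runs of 1.
labels, n_runs = label_runs([1, 0, 0, 1, 1, 0, 1])
print(labels, n_runs)
```

Numbering the runs in this way is what later allows detected and missed runs to be counted for any sampling interval.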

MTB > ACF 5 C10

ACF of C10

          -1.0 -0.8 -0.6 -0.4 -0.2  0.0  0.2  0.4  0.6  0.8  1.0
            +----+----+----+----+----+----+----+----+----+----+
  1   0.031                          XX
  2   0.047                          XX
  3   0.051                          XX
  4   0.086                          XXX
  5  -0.025                          XX

MTB > COPY C10 C12;
SUBC> OMIT 1:1.
MTB > COPY C10 C11;
SUBC> OMIT K3:K3.
MTB > CORR C11 C12,M1

Correlation of C11 and C12 = 0.031

MTB > COPY M1 C12-C13
MTB > LET K9=C13(1)
MTB > LET K10=K8*K9*K9
MTB > # The autocorrelation coefficient and the chi-square (1 df) are:
MTB > PRINT K9 K10
K9        0.031331
K10       0.205163
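The lag-one calculation above correlates the series with itself shifted by one observation and then forms K10 = K8 x K9 x K9, where K8 is the number of observations less one. A Python sketch of the same calculation follows; the function name and the alternating test series are hypothetical, and the series is assumed to contain both 0s and 1s so that the correlation is defined.

```python
import math


def lag1_autocorrelation(series):
    """Return (r, chi-square with 1 df) for a series and its lag-one copy."""
    x = series[:-1]  # series without its last value
    y = series[1:]   # series without its first value
    n = len(x)       # number of pairs, i.e. observations less one
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r, n * r * r


# Hypothetical strictly alternating series: perfectly negatively correlated.
r, chi_square = lag1_autocorrelation([0, 1, 0, 1, 0, 1])
print(round(r, 3), round(chi_square, 3))
```

For the 210-value series above, 209 pairs and a coefficient near 0.0313 give a chi-square of about 0.205, well below the 5% critical value of 3.84 for one degree of freedom.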

MTB > #
MTB > # Now we only observe at intervals (e.g. the series is daily and you
MTB > #   are sampling every so many days).
MTB > #
MTB > # LET K14 = The sampling (observation) interval you want, e.g. 3 if you
MTB > #   want every 3 days, and then enter "EXEC 'BINSER5.DAT' ".
MTB > let k14=2                 ***** Entry from keyboard *****
MTB > exec 'binser5.dat'        ***** Entry from keyboard *****
MTB > #
MTB > # The sampling interval is now:
MTB > PRINT K14
K14       2.00000
MTB > LET K9=ROUND(K3/K14+0.5)
MTB > LET K15=K14-1
MTB > SET C12
MTB > END
MTB > STACK C12 1,C12
MTB > OH=0
MTB > EXEC 'BINSER6.DAT' K9 TIMES
MTB > STACK C12 C11,C11
MTB > STACK C12 C11,C11
MTB > STACK C12 C11,C11

          (Deletion of repeated commands here)

MTB > STACK C12 C11,C11
MTB > STACK C12 C11,C11
MTB > OH=24
MTB > COPY C11 C11;
SUBC> USE 1:K3.
MTB > LET C12=C15*C11
MTB > NAME C15 'SERIES', C11 'OBSERVE', C12 'DETECTED'
MTB > #
MTB > # Following are the generated series (col.1), observation times (col.2),
MTB > #   and occurrences of 1 that are detected (col.3):
MTB > PRINT C15 C11 C12


 ROW   SERIES   OBSERVE   DETECTED

   1        1         0          0
   2        0         1          0
   3        0         0          0
   4        0         1          0
   5        2         0          0
   6        2         1          2
   7        2         0          0
   8        2         1          2
   9        2         0          0
  10        2         1          2
  11        0         0          0
  12        0         1          0
  13        3         0          0
  14        0         1          0
  15        0         0          0
  16        4         1          4
  17        0         0          0
  18        5         1          5
  19        0         0          0
  20        6         1          6
  21        0         0          0
  22        0         1          0
  23        7         0          0
  24        0         1          0
  25        0         0          0
  26        0         1          0
  27        0         0          0
  28        8         1          8
  29        0         0          0
  30        0         1          0
  31        9         0          0
  32        0         1          0
  33        0         0          0
  34        0         1          0
  35        0         0          0

  36        0         1          0
  37       10         0          0
  38       10         1         10
  39        0         0          0
  40        0         1          0
  41        0         0          0
  42        0         1          0
  43        0         0          0
  44        0         1          0
  45        0         0          0
  46       11         1         11
  47       11         0          0
  48        0         1          0
  49        0         0          0
  50       12         1         12
  51       12         0          0
  52        0         1          0
  53       13         0          0
  54       13         1         13
  55       13         0          0
  56        0         1          0
  57       14         0          0
  58        0         1          0
  59       15         0          0
  60        0         1          0
  61        0         0          0
  62        0         1          0
  63        0         0          0
  64       16         1         16
  65        0         0          0
  66        0         1          0
  67       17         0          0
  68       17         1         17
  69       17         0          0
  70        0         1          0
  71       18         0          0
  72        0         1          0
  73        0         0          0
  74        0         1          0
  75       19         0          0
  76        0         1          0
  77       20         0          0
  78        0         1          0
  79        0         0          0
  80       21         1         21
  81        0         0          0
  82        0         1          0
  83       22         0          0
  84        0         1          0
  85       23         0          0
  86        0         1          0
  87       24         0          0
  88       24         1         24
  89        0         0          0
  90       25         1         25
  91        0         0          0
  92        0         1          0
  93        0         0          0
  94       26         1         26
  95       26         0          0
  96       26         1         26
  97       26         0          0
  98       26         1         26
  99       26         0          0
 100       26         1         26
 101       26         0          0
 102        0         1          0
 103        0         0          0
 104        0         1          0
 105       27         0          0
 106        0         1          0
 107        0         0          0
 108        0         1          0
 109        0         0          0
 110       28         1         28
 111       28         0          0
 112        0         1          0
 113       29         0          0
 114       29         1         29
 115        0         0          0
 116       30         1         30
 117       30         0          0
 118       30         1         30
 119        0         0          0
 120        0         1          0
 121       31         0          0
 122       31         1         31
 123        0         0          0
 124        0         1          0
 125        0         0          0
 126        0         1          0
 127        0         0          0
 128       32         1         32
 129       32         0          0
 130       32         1         32
 131        0         0          0
 132       33         1         33
 133       33         0          0
 134        0         1          0
 135        0         0          0
 136       34         1         34
 137       34         0          0
 138        0         1          0
 139        0         0          0
 140        0         1          0
 141        0         0          0
 142       35         1         35
 143        0         0          0
 144        0         1          0
 145        0         0          0
 146        0         1          0
 147        0         0          0
 148       36         1         36
 149        0         0          0
 150       37         1         37
 151        0         0          0
 152        0         1          0
 153        0         0          0
 154        0         1          0
 155        0         0          0
 156        0         1          0
 157        0         0          0
 158        0         1          0
 159        0         0          0
 160        0         1          0
 161        0         0          0
 162       38         1         38
 163       38         0          0
 164       38         1         38
 165        0         0          0
 166       39         1         39
 167        0         0          0
 168        0         1          0
 169       40         0          0
 170       40         1         40
 171       40         0          0
 172        0         1          0
 173       41         0          0
 174        0         1          0
 175        0         0          0
 176        0         1          0
 177       42         0          0
 178        0         1          0
 179       43         0          0
 180       43         1         43
 181       43         0          0
 182        0         1          0
 183       44         0          0
 184        0         1          0
 185       45         0          0
 186       45         1         45
 187        0         0          0
 188       46         1         46
 189       46         0          0
 190       46         1         46
 191       46         0          0
 192       46         1         46
 193        0         0          0
 194       47         1         47
 195       47         0          0
 196       47         1         47
 197       47         0          0
 198        0         1          0
 199        0         0          0
 200       48         1         48
 201        0         0          0
 202        0         1          0
 203        0         0          0
 204       49         1         49
 205        0         0          0
 206        0         1          0
 207        0         0          0
 208       50         1         50
 209        0         0          0
 210       51         1         51

MTB > LET C16=C11*C15
MTB > #
MTB > # The number of runs of 1 that were missed by a given sampling scheme can
MTB > #   be calculated as the number of runs of 1 minus the number of non-zero
MTB > #   categories listed in the following TALLYs of the data.

MTB > TALLY C16

     C16  COUNT
       0    163
       2      3
       4      1
       5      1
       6      1
       8      1
      10      1
      11      1
      12      1
      13      1
      16      1
      17      1
      21      1
      24      1
      25      1
      26      4
      28      1
      29      1
      30      2
      31      1
      32      2
      33      1
      34      1
      35      1
      36      1
      37      1
      38      2
      39      1
      40      1
      43      1
      45      1
      46      3
      47      2
      48      1
      49      1
      50      1
      51      1
      N=    210

MTB > #
MTB > # You can change the value stored in K14 and "EXEC 'BINSER5.DAT' " again.
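The counting rule stated in the macro (runs missed = number of runs of 1 minus the number of non-zero run numbers that appear among the observed values) can be sketched in Python as follows. The function name and the short test series are hypothetical. Observations are assumed to fall on days that are exact multiples of the sampling interval, which matches the OBSERVE column above for an interval of 2.

```python
def missed_runs(series, interval):
    """Return (runs of 1, runs detected, runs missed) for a sampling interval."""
    labels, run, previous = [], 0, 0
    for x in series:
        if x == 1 and previous != 1:
            run += 1  # a new run of 1's begins here
        labels.append(run if x == 1 else 0)
        previous = x
    # Observe on days interval, 2*interval, ... (days are numbered from 1).
    detected = {lab for day, lab in enumerate(labels, start=1)
                if day % interval == 0 and lab > 0}
    return run, len(detected), run - len(detected)


# Hypothetical series with three runs of 1, sampled every second day.
total, found, missed = missed_runs([1, 0, 0, 1, 1, 0, 1], 2)
print(total, found, missed)
```

Repeating the calculation for a range of intervals shows how quickly short-lived occurrences start to be missed as sampling becomes less frequent.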
DECLARE SUB AN3 ()
DECLARE SUB Script ()
DECLARE SUB READIN3 (arraydat!())
DECLARE SUB PROC2 (arraydat!(), samplenum%)
DECLARE SUB CORRELOGRAM ()
DECLARE SUB An2 ()
DECLARE SUB readin2 (arraydat!())
DECLARE SUB An1 ()
DECLARE SUB READIN (arraydat!())
DECLARE SUB Thrice (arraydat!(), matrix!(), samplenum%)
DECLARE SUB Proc1 (matrix!(), samplenum%)
DECLARE SUB Sample (arraydat!(), matrix!(), samplenum%)
DECLARE SUB Options2 (anals$())
DECLARE SUB Draw2 ()
DECLARE SUB Fileload (arraydat!())
DECLARE SUB Screensave ()
DECLARE SUB Refresh ()
DECLARE SUB Options (OPTS$())
DECLARE SUB Printfile (arraydat())
DECLARE SUB waiter ()
DECLARE SUB Ender ()
DECLARE SUB Draw1 ()
DECLARE SUB Bargraph ()
DECLARE SUB FRAME (LEFTCOL%, RIGHTCOL%, TOPROW%, BOTTOMROW%)
DECLARE SUB Menu (MENUCHOICES$(), NUMCHOSEN%)
DECLARE SUB Summarize ()


SIMULATE.EXE 
J.E.   O'Neill 
SEPTEMBER  3,    1988. 


OPTION BASE 1
COMMON SHARED X, Nx, n, Direct, arraydat(), xbar, xsd, Xvar, R, Flag1
COMMON SHARED Flag3, Flagjk, DELAY, MIN, max, simlength, Flagfile, Max.conc
COMMON SHARED LPS, MONTH, xc, rn$, MU, matrix(), FREQ(), excess, V, FLAG6
COMMON SHARED Hit, Missed, Samplecount, Num1, Val1, Pres1, skew, Kurt
ON ERROR GOTO HANDLER
SCREEN 0 ' ***FOR CGA 640 BY 200
DIM arraydat(366) ' ***HOLDS ONE YEAR'S DATA
DIM MEN$(9) ' ***MAIN MENU ITEMS
DIM FREQ(10)
DIM matrix(31)
DIM SHARED t975(30)

t975(1)  =  12.71 

t975(2)  =  4.3 

t975(3)  =  3.18 

t975(4)  =  2.78 

t975(5)  =  2.57 

t975(6)  =  2.45 

t975(7)  =  2.36

t975(8)  =  2.31 

t975(9)  =  2.26 

t975(10)  =  2.23 

t975(11)  =  2.2


t975(12)  =  2.18 

t975(13)  =  2.16 

t975(14)  =  2.14

t975(15)  =  2.13

t975(16)  =  2.12 

t975(17)  =  2.11 

t975(18)  =  2.1 

t975(19)  =  2.09 

t975(20)  =  2.09 

t975(21)  =  2.08 

t975(22)  =  2.07 

t975(23)  =  2.07 

t975(24)  =  2.06 

t975(25)  =  2.06 

t975(26)  =  2.06 

t975(27)  =  2.05 

t975(28)  =  2.05 

t975(29)  =  2.04 

t975(30)  =  2.04 

true%  =  -1 

false%  =  0 

FOR I% = 1 TO 9 ' ***READ IN MAIN MENU ITEMS
    READ MEN$(I%)
NEXT I%
DATA    1 CHANGE DEFAULT SETTINGS
DATA    2 DRAW POLLUTOGRAPH
DATA    3 LOAD DATASET
DATA    4 CALCULATE FREQUENCY DISTRIBUTION
DATA    5 CALCULATE SUMMARY STATISTICS
DATA    6 WRITE DATA TO FILE
DATA    7 REDRAW POLLUTOGRAPH
DATA    8 ANALYSIS PROGRAMS
DATA    9 QUIT
' ==========O P T I O N S ==========================================
DIM OPTS$(4) ' ***OPTIONS MENU
FOR I% = 1 TO 4
    READ OPTS$(I%)
NEXT I%
DATA 1 INITIALIZE PRINTER SETTINGS
DATA 2 SET TINY PRINT
DATA 3 DRAW DELAY FACTOR
DATA 4 RETURN TO MAIN MENU
DIM anals$(5)
FOR I% = 1 TO 5
    READ anals$(I%)
NEXT I%
DATA 1 CONFIDENCE INTERVAL FOR MEANS
DATA 2 PRESENCE/ABSENCE OF CONSTITUENTS
DATA 3 CORRELOGRAM
DATA 4 CONFIDENCE INTERVAL (FULL)
DATA 5 RETURN TO MAIN MENU
OK% = false%
Flag1 = 0: Flag3 = 0

LOOP1:
CALL Menu(MEN$(), CHOICE%)
IF CHOICE% = 1 THEN CALL Options(OPTS$())
IF (CHOICE% = 2) THEN
    Flag1 = 1
    CALL Draw1
END IF
IF (CHOICE% = 3) THEN
    CALL Fileload(arraydat())
    Flag1 = 1
END IF
IF ((CHOICE% = 4) AND (Flag1 = 0)) THEN
    CLS
    LOCATE 10, 20
    PRINT "YOU MUST DRAW A GRAPH BEFORE YOU CAN ANALYZE IT."
    LOCATE 11, 20
    PRINT "CHOSE 2 OR 8(END) ON MAIN MENU"
    CALL FRAME(15, 70, 8, 13)
    BEEP
    CALL waiter
    GOTO LOOP1
END IF
IF ((CHOICE% = 4) AND (Flag1 = 1)) THEN CALL Bargraph
IF ((CHOICE% = 5) AND (Flag1 = 0)) THEN
    CLS
    LOCATE 10, 20
    PRINT "YOU MUST DRAW A GRAPH BEFORE YOU CAN ANALYZE IT."
    LOCATE 11, 20
    PRINT "CHOSE 2 OR 8(END) ON MAIN MENU"
    CALL FRAME(15, 70, 8, 13)
    BEEP
    CALL waiter
    GOTO LOOP1
END IF
IF ((CHOICE% = 5) AND (Flag1 = 1)) THEN
    Flag3 = 1
    CALL Summarize
END IF
' Saving (option 6) requires a graph that has been drawn and analyzed.
IF ((CHOICE% = 6) AND (Flag1 = 0)) THEN
    CLS
    LOCATE 10, 20
    PRINT "YOU MUST DRAW A GRAPH THEN ANALYZE IT BEFORE"
    LOCATE 11, 20
    PRINT "YOU CAN SAVE IT."
    LOCATE 12, 20
    PRINT "CHOSE 2 OR 8 (END) ON MAIN MENU"
    CALL FRAME(15, 70, 8, 13)
    BEEP
    CALL waiter
    GOTO LOOP1
END IF
IF ((CHOICE% = 6) AND (Flag1 = 1) AND (Flag3 = 0)) THEN
    CLS
    LOCATE 10, 20
    PRINT "YOU MUST ANALYZE THE DATASET BEFORE YOU CAN SAVE IT"
    LOCATE 11, 20
    PRINT "DATA SET WILL BE ANALYZED NOW"
    LOCATE 12, 20
    PRINT "TO SAVE DATASET CHOSE #6 FROM THE MENU."
    CALL FRAME(15, 72, 8, 14)
    BEEP
    CALL waiter
    CALL Summarize
    Flag3 = 1
    GOTO LOOP1
END IF
IF ((CHOICE% = 6) AND (Flag1 = 1) AND (Flag3 = 1)) THEN
    CALL Printfile(arraydat())
END IF
IF (CHOICE% = 7) THEN CALL Refresh
IF CHOICE% = 8 THEN CALL Options2(anals$())
IF CHOICE% = 9 THEN CALL Ender
GOTO LOOP1
END

REM ERROR HANDLING ROUTINES
HANDLER:
SELECT CASE ERR
CASE 6
    BEEP
    PRINT "DIVISION BY ZERO ... CASE SKIPPED"
    LPRINT "DIVISION BY ZERO ... CASE SKIPPED"
    RESUME NEXT
CASE 27
    BEEP
    PRINT " PRINTER OUT OF PAPER...PRESS A KEY WHEN READY"
    AAA$ = ""
    WHILE AAA$ = ""
        AAA$ = INKEY$
    WEND
    PRINT "RETRYING PRINT REQUEST..."
    RESUME
CASE 68
    BEEP
    PRINT "PRINTER NOT READY...PUT PRINTER ON LINE AND PRESS A KEY"
    AAA$ = ""
    WHILE AAA$ = ""
        AAA$ = INKEY$
    WEND
    PRINT "RETRYING PRINT REQUEST..."
    RESUME
CASE 25
    BEEP
    PRINT "PRINTER NOT READY...PUT PRINTER ON LINE AND PRESS A KEY"
    AAA$ = ""
    WHILE AAA$ = ""
        AAA$ = INKEY$
    WEND
    PRINT "RETRYING PRINT REQUEST..."
    RESUME
CASE 53
    BEEP
    PRINT " FILE NOT FOUND."
    FOR X = 1 TO 150: NEXT X
    RESUME LOOP1
CASE ELSE
    PRINT "UNANTICIPATED ERROR ENCOUNTERED CODE ", ERR
    END
END SELECT
'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 


SUB An1
INPUT "ENTER RUN NUMBER ============>"; rn$
LPRINT CHR$(27); CHR$(15);
LPRINT CHR$(27); CHR$(9);
LPRINT "* * * *CONFIDENCE INTERVAL ANALYSIS * * * * "
LPRINT "      GARTNER LEE LIMITED"
LPRINT "SIMULATION RUN NUMBER "; rn$
LPRINT DATE$
LPRINT TIME$
LPRINT "STATISTICS CALCULATED USING POPULATION MEAN (Mu)"
timestart = TIMER
CLS : PRINT "PROGRAM RUNDATE - "; DATE$; ", RUN TIME - "; TIME$; "."
CALL READIN(arraydat())
WIDTH LPRINT 132
FOR MONTH = 1 TO 12
    IF MONTH = 7 THEN ' SET TO 7 FOR COMPRESSED PRINT
        BEEP '        13 FOR TINY PRINT
        LPRINT CHR$(12);
        aa$ = ""
        PRINT "CHANGE PAPER AND PRESS A KEY WHEN READY"
        WHILE aa$ = ""
            aa$ = INKEY$
        WEND
    END IF
    FLAG6 = 1
    PRINT "MONTH ", MONTH
    LPRINT ""
    LPRINT "MONTH ", MONTH
    LPRINT "  FR   N        UCL      MEAN       LCL      RANGE      SD        VAR        AC       SKEW     EXCESS    DELTA
    LPRINT
    LPS = 1
    CALL Sample(arraydat(), matrix(), samplenum%)
    CALL Proc1(matrix(), samplenum%)
    CALL Thrice(arraydat(), matrix(), samplenum%)
    CALL Proc1(matrix(), samplenum%)
    LPS = 7
    CALL Sample(arraydat(), matrix(), samplenum%)
    CALL Proc1(matrix(), samplenum%)
NEXT MONTH
PRINT "": PRINT "NORMAL TERMINATION AT   "; TIME$
timeend = TIMER
LPRINT ""
LPRINT "Computation time =";
LPRINT USING "  ###.#  "; (timeend - timestart) / 60;
LPRINT "  minutes."
LPRINT CHR$(12);
END SUB
SUB AN3
INPUT "ENTER RUN NUMBER ============>"; rn$
LPRINT CHR$(27); CHR$(15);
LPRINT CHR$(27); CHR$(9);
LPRINT "* * * *CONFIDENCE INTERVAL ANALYSIS * * * * "
LPRINT "  FULL ANALYSIS"
LPRINT "      GARTNER LEE LIMITED"
LPRINT "SIMULATION RUN NUMBER "; rn$
LPRINT DATE$
LPRINT TIME$
timestart = TIMER
CLS : PRINT "PROGRAM RUNDATE - "; DATE$; ", RUN TIME - "; TIME$; "."
CALL READIN(arraydat())
WIDTH LPRINT 132
INPUT "ENTER MONTH TO ANALYZE (1 TO 12) "; MONTH
IF MONTH = 13 THEN ' SET TO 7 FOR COMPRESSED PRINT
    BEEP '        13 FOR TINY PRINT
    LPRINT CHR$(12);
    aa$ = ""
    PRINT "CHANGE PAPER AND PRESS A KEY WHEN READY"
    WHILE aa$ = ""
        aa$ = INKEY$
    WEND
END IF
FOR FLAG6 = 1 TO 2
    LPRINT ""
    IF FLAG6 = 1 THEN
        LPRINT "STATISTICS CALCULATED USING POPULATION MEAN (Mu) FOR MONTH "; MONTH
    END IF
    IF FLAG6 = 2 THEN
        LPRINT "STATISTICS CALCULATED USING SAMPLE MEAN (Xbar) FOR MONTH "; MONTH
    END IF
    LPRINT "
    LPRINT "  FR   N        UCL      MEAN       LCL      RANGE      SD        VAR
    LPRINT "
    LPRINT ""
    FOR LPS = 1 TO 15
        CALL Sample(arraydat(), matrix(), samplenum%)
        CALL Proc1(matrix(), samplenum%)
        IF LPS = 1 THEN
            LPRINT "
        END IF
    NEXT LPS
    CALL Thrice(arraydat(), matrix(), samplenum%)
    CALL Proc1(matrix(), samplenum%)
NEXT FLAG6
PRINT "": PRINT "NORMAL TERMINATION AT   "; TIME$
timeend = TIMER
LPRINT ""
LPRINT "Computation time =";
LPRINT USING "  ###.#  "; (timeend - timestart) / 60;
LPRINT "  minutes."
LPRINT CHR$(12);
END SUB
SUB An2
FOR X = 1 TO 5
    PRINT "  "
NEXT X
LPRINT "* * * *PRESENCE/ABSENCE ANALYSIS PROGRAM* * * * "
LPRINT "       GARTNER LEE LIMITED"
LPRINT "       RUN DATE "; DATE$
LPRINT "       TIME "; TIME$
LPRINT "  "
PRINT "  WORKING  "
FREQ(1) = 365     '*******SPECIFY SAMPLING INTERVALS********
FREQ(2) = 180
FREQ(3) = 90
FREQ(4) = 60
FREQ(5) = 30
FREQ(6) = 7
FREQ(7) = 2
FREQ(8) = 1
RENTER1:
CALL readin2(arraydat())  '****READ IN DATA AND COMPUTE PRES/ABSENCE***
LPRINT "  "
LPRINT "SAMPLES  HIT   MISSED  %   ACTUAL     FOO       FOO-ERROR"
LPRINT " "
FOR I = 1 TO 8
    Samplecount = INT(365 / FREQ(I))
    n = 1: NHIT = 0: Missed = 0: Hit = 0
    FOR z = 1 TO 365
        pres = arraydat(z)
        NHIT = 0
        IF n MOD FREQ(I) = 0 THEN NHIT = pres
        IF n MOD FREQ(I) <> 0 THEN NHIT = -9
        IF pres = 1 AND NHIT = -9 THEN Missed = Missed + 1
        IF pres = 1 AND NHIT = 1 THEN Hit = Hit + 1
        n = n + 1
    NEXT z
    CLOSE
    IF Num1 = 0 THEN Val1 = 0
    IF Num1 > 0 THEN Val1 = (Hit / Num1) * 100
    FOO = (Hit / Samplecount) * 100
    PREERROR = FOO - (Num1 / 365) * 100
    LPRINT USING "  ###  "; Samplecount, Hit, Missed;
    LPRINT USING "  ###.###  "; Val1; FOO; PREERROR
NEXT I
LPRINT "  "
INPUT "CHANGE DETECTION LIMIT (Y/N)?"; ANS$
IF ANS$ = "y" THEN GOTO RENTER1
IF ANS$ = "Y" THEN GOTO RENTER1
GOTO ENDR
IF ANS$ = CHR$(121) THEN GOTO RENTER1
ENDR:
LPRINT CHR$(12);
LPRINT "        FINISHED"
END SUB
SUB Bargraph
'======================================SCREEN BAR GRAPH ==============
CLS
SCREEN 0
pst:
INPUT "ENTER NUMBER OF CLASS INTERVALS (5 TO 20) ===> ", int.num
IF int.num > 20 OR int.num < 5 THEN GOTO pst
LOCATE 1, 65
PRINT "PLEASE WAIT...";
LOCATE 2, 20
DIM bval(20)
MIN = 999999
max = 0
CLS
FOR I = 1 TO int.num
    bval(I) = 0
NEXT I
FOR z = 1 TO 365
    xx = arraydat(z)
    IF xx > max THEN max = xx
    IF xx < MIN THEN MIN = xx
NEXT z
Range = max - MIN
int.size = CINT(Range / int.num)
PRINT "MIN  MAX  RANGE      INTERVAL SIZE"
PRINT MIN, max, Range, int.size
FOR I = 1 TO 365
    FOR a = 1 TO int.num
        temp.val = (MIN + (a - 1) * int.size)
        SELECT CASE a
        CASE IS < int.num
            IF arraydat(I) >= temp.val AND arraydat(I) < (temp.val + int.size) THEN
                bval(a) = bval(a) + 1
            END IF
        CASE IS = int.num
            IF arraydat(I) >= temp.val AND arraydat(I) <= max THEN
                bval(a) = bval(a) + 1
            END IF
        END SELECT
    NEXT a
NEXT I
SCALE1 = 4: SCALE2 = 60: SCALE3 = 60
DSCAG:
SCREEN 0: CLS : symbol = 177
FOR a = 1 TO int.num
    LOCATE 4 + a, 30: PRINT STRING$(bval(a) / SCALE1, symbol); bval(a);
NEXT a
LOCATE 3, 10: PRINT "       CLASS";
FOR a = 1 TO int.num
    temp.val = (MIN + (a - 1) * int.size)
    LOCATE a + 4, 10: PRINT USING "######"; temp.val;
    PRINT "   -";
    SELECT CASE a
    CASE IS < int.num
        PRINT USING "#####"; temp.val + int.size - 1
    CASE IS = int.num
        PRINT USING "#####"; max
    END SELECT
NEXT a
LOCATE 24, 10
PRINT "ABSOLUTE   FREQUENCY   DISTRIBUTION";
CALL waiter
SCREEN 0: CLS : symbol = 177
Cumcount = 0
LOCATE 2, 60: PRINT "CUMULATIVE"
LOCATE 3, 60: PRINT "FREQUENCY "
FOR a = 1 TO int.num
    LOCATE 4 + a, 30: PRINT STRING$((bval(a) / 365 * SCALE2), symbol);
    PRINT USING "###.##"; bval(a) / 3.65;
    Cumcount = Cumcount + bval(a) / 3.65
    LOCATE 4 + a, 60: PRINT USING "###.##"; Cumcount;
NEXT a
LOCATE 3, 10: PRINT "  CLASS";
FOR a = 1 TO int.num
    temp.val = (MIN + (a - 1) * int.size)
    LOCATE a + 4, 10: PRINT USING "######"; temp.val;
    PRINT "   -";
    SELECT CASE a
    CASE IS < int.num
        PRINT USING "#####"; temp.val + int.size - 1
    CASE IS = int.num
        PRINT USING "#####"; max
    END SELECT
NEXT a
LOCATE 24, 10
PRINT "RELATIVE   FREQUENCY   DISTRIBUTION";
CALL waiter
SCREEN 0: CLS : symbol = 177
Cumcount = 0
LOCATE 3, 10: PRINT "  CLASS";
LOCATE 2, 60: PRINT "CUMULATIVE": LOCATE 3, 60: PRINT "FREQUENCY"
FOR a = 1 TO int.num
    Cumcount = Cumcount + bval(a) / 3.65
    LOCATE 4 + a, 30: PRINT STRING$((Cumcount / 365 * SCALE3), symbol);
    LOCATE 4 + a, 60: PRINT USING "###.##"; Cumcount;
NEXT a
FOR a = 1 TO int.num
    temp.val = (MIN + (a - 1) * int.size)
    LOCATE a + 4, 10: PRINT USING "######"; temp.val;
    PRINT "   -";
    SELECT CASE a
    CASE IS < int.num
        PRINT USING "#####"; temp.val + int.size - 1
    CASE IS = int.num
        PRINT USING "#####"; max
    END SELECT
NEXT a
LOCATE 24, 10
PRINT "CUMULATIVE   FREQUENCY   DISTRIBUTION";
CALL waiter
CLS : LOCATE 10, 10:
INPUT "CHANGE SCALING FACTORS? (Y/N) "; ANS$
IF ANS$ = "Y" THEN GOTO OK1
IF ANS$ = "y" THEN GOTO OK1
GOTO NOTOK
OK1:
PRINT "SCALE1 (ABSOLUTE) SET AT "; SCALE1
INPUT "ENTER NEW VALUE..."; SCALE1
PRINT "SCALE2 (RELATIVE) SET AT "; SCALE2
INPUT "ENTER NEW VALUE..."; SCALE2
PRINT "SCALE3 (CUMULATIVE) SET AT "; SCALE3
INPUT "ENTER NEW VALUE..."; SCALE3
GOTO DSCAG
NOTOK:
END SUB

SUB  CORRELOGRAM 

CLS 

PRINT  "CORRELOGRAM  PROGRAM" 

STARTTIME  =  TIMER 

INPUT  "ENTER  RUN  NUMBER  ============>";  rn$ 

CALL  READIN3(arraydat()): 

LPS  =  1 

CALL  PROC2(arraydat(),  samplenum%)
LPRINT  CHR$(12); 
ENDTIME  =  TIMER 

LPRINT  "COMPUTATION  TIME  =  ";  (ENDTIME  -  STARTTIME)  /  60 
END  SUB 

SUB  Drawl 

CLS 

Flag3  =  0 

SCREEN  2 

INPUT  "ENTER  UPPER  CONCENTRATION  LIMIT===>",  Max. cone 

INPUT  "ENTER  UNITS  ======================>",  UNITSS 

INPUT  "ENTER  RUN  #  ";  rn$ 

Yscale  =  (Max. cone  /  8A) 

CLS 

simlength  =  365 

LOCATE  2,  65 

LINE (100, 180)-(500, 180) 'X-AXIS
LINE (100, 10)-(100, 180) 'Y-AXIS
FOR q = 1 TO 12
LINE (100 + q * 30, 180)-(100 + q * 30, 175)
NEXT 'HORIZONTAL TICKS
'========== LABEL X-AXIS ==========


LOCATE  24,  14:  PRINT  1; 

LOCATE  24,  22:  PRINT  3; 

LOCATE  24,  29:  PRINT  5; 

LOCATE  24,  37:  PRINT  7; 

LOCATE  24,  44:  PRINT  9; 


LOCATE  24,  18:  PRINT  2 

LOCATE  24,  25:  PRINT  4 

LOCATE  24,  33:  PRINT  6 

LOCATE  24,  40:  PRINT  8 

LOCATE  24,  47:  PRINT  10 


LOCATE  24,  51:  PRINT  11;  :  LOCATE  24,  55:  PRINT  12; 

FOR q = 1 TO 10
LINE (95, 180 - (16.4 * q))-(465, 180 - (16.4 * q))
NEXT q
LOCATE 3, 7: PRINT Max.conc
LOCATE 5, 7: PRINT Max.conc * .9
LOCATE 7, 7: PRINT Max.conc * .8
LOCATE 9, 7: PRINT Max.conc * .7
LOCATE 11, 7: PRINT Max.conc * .6
LOCATE 13, 7: PRINT Max.conc * .5
LOCATE 15, 7: PRINT Max.conc * .4
LOCATE 17, 7: PRINT Max.conc * .3
LOCATE 19, 7: PRINT Max.conc * .2
LOCATE 21, 7: PRINT Max.conc * .1
LOCATE 23, 7: PRINT 0
LOCATE 12, 2: PRINT UNITS$;
LINE  (0,  0)-(600,  199),  ,  B 
LOCATE  2,  65:  PRINT  "RUN  #  ";  rn$; 
BEEP 

AAA$ = ""
WHILE AAA$ = "" 'WAIT FOR USER TO PRESS A KEY
AAA$ = INKEY$ 'BEFORE STARTING TO DRAW
WEND 'POLLUTOGRAPH.
y = STICK(1)
y = 180 - (y * Yscale / Max.conc) * 164
PSET (100, y)

IF Flagfile = 1 THEN GOTO point2
FOR n = 2 TO simlength
FOR JAYJAY = 1 TO DELAY: NEXT
ystick = STICK(1): a = ystick
correction = (((Yscale - 0 + 1) * RND + 0))
ystick = ((ystick * Yscale) - (Yscale * 2) - (Yscale / 2)) + correction
IF ystick < 0 THEN ystick = 0
arraydat(n) = ystick
'LOCATE 2, 50: PRINT USING "#######"; arraydat(n); ystick; a;
LINE -(100 + n, 180 - (arraydat(n) / Max.conc) * 164)
NEXT n

point2:

IF  Flagfile  =   1  THEN 

CALL  Draw2 

Flagfile  =  0 

END  IF 

CALL  Screensave 

as$ = ""
spec:
as$ = INKEY$
SELECT CASE as$

CASE  "S" 

CALL  Script 
CASE  "s" 

CALL  Script 
CASE  "P" 

SHELL  "ps" 
CASE  "p" 

SHELL    "ps" 
CASE    "" 
GOTO  spec 
CASE    ELSE 
END   SELECT 
END SUB

SUB Draw2
y = 180 - (arraydat(1) / Max.conc) * 164
PSET (100, y)
FOR n = 2 TO simlength
LINE -(100 + n, 180 - (arraydat(n) / Max.conc) * 164)
NEXT n
Flagjk = 0 'default

END  SUB 


SUB Ender
CLOSE
END
END SUB

SUB Fileload (arraydat())
CLS
CALL FRAME(6, 75, 8, 14)
LOCATE 10, 10: PRINT "DATASET MUST BE IN FORM <C:\DESIGN\DATA00_.PRN> TO LOAD."
LOCATE 11, 10: PRINT "IF YOU ENTER WRONG RUN # YOU WILL BE RETURNED TO MAIN MENU."
LOCATE 12, 10: INPUT "ENTER RUN NUMBER ONLY==========>", n$
OPEN "C:\DESIGN\DATA00" + n$ + ".PRN" FOR INPUT AS #1
n = 1: pres.max = 0
WHILE EOF(1) = 0
INPUT #1, pres
arraydat(n) = pres
IF pres > pres.max THEN pres.max = pres
n = n + 1

WEND 

CLOSE 

CLS 

PRINT  "MAX.CONC  =  ";  pres.max 

CALL  waiter 

Flagfile  =  1 

CALL Draw1


END  SUB 

SUB FRAME (LEFTCOL%, RIGHTCOL%, TOPROW%, BOTTOMROW%) STATIC
LOCATE TOPROW%, LEFTCOL%: PRINT CHR$(201)
LOCATE TOPROW%, RIGHTCOL%: PRINT CHR$(187)
LOCATE BOTTOMROW%, LEFTCOL%: PRINT CHR$(200);
LOCATE BOTTOMROW%, RIGHTCOL%: PRINT CHR$(188);
FOR VERTLINE% = TOPROW% + 1 TO BOTTOMROW% - 1
LOCATE VERTLINE%, LEFTCOL%: PRINT CHR$(186);
LOCATE VERTLINE%, RIGHTCOL%: PRINT CHR$(186);
NEXT VERTLINE%
HORIZLENGTH% = RIGHTCOL% - LEFTCOL% - 1
HORIZLINE$ = STRING$(HORIZLENGTH%, 205)
LOCATE TOPROW%, LEFTCOL% + 1: PRINT HORIZLINE$
LOCATE BOTTOMROW%, LEFTCOL% + 1: PRINT HORIZLINE$;
END  SUB 

'MENU.BAS 

SUB Menu (MENUCHOICES$(), NUMCHOSEN%) STATIC
CLS
SCREEN 0
NUMOFCHOICES% = UBOUND(MENUCHOICES$)
PROMPT$ = " "
OKSTRING$ = ""
LONGSTRING% = 0
true% = -1
false% = 0
FOR I% = 1 TO NUMOFCHOICES%
FIRST$ = UCASE$(LEFT$(MENUCHOICES$(I%), 1))
OKSTRING$ = OKSTRING$ + FIRST$
PROMPT$ = PROMPT$ + FIRST$ + "   "
LTEMP% = LEN(MENUCHOICES$(I%))
IF (LTEMP% > LONGSTRING%) THEN LONGSTRING% = LTEMP%
NEXT I%
LONGSTRING% = LONGSTRING% + 1
PROMPT$ = PROMPT$ + "->"
IF LEN(PROMPT$) >= LONGSTRING% THEN LONGSTRING% = LEN(PROMPT$) + 1
LC% = 37 - (LONGSTRING% \ 2)
RC% = 80 - LC%
TC% = 3
BC% = 10 + NUMOFCHOICES%
CALL FRAME(LC%, RC%, TC%, BC%)
FOR I% = 1 TO NUMOFCHOICES%
LOCATE 6 + I%, LC% + 3
PRINT UCASE$(LEFT$(MENUCHOICES$(I%), 1)) + ")" + MID$(MENUCHOICES$(I%), 2)
NEXT I%
LOCATE 4, 38: PRINT "MENU"
LINE$ = STRING$(LONGSTRING%, 196)
LOCATE 5, LC% + 3: PRINT LINE$
LOCATE 7 + NUMOFCHOICES%, LC% + 3: PRINT LINE$
LOCATE 9 + NUMOFCHOICES%, LC% + 3: PRINT PROMPT$;
CTRLKEYS$ = CHR$(13) + CHR$(27)
DONE% = false%
WHILE NOT DONE%
LOCATE , , 1
CHARPOS% = 0
' WAIT FOR A KEY THAT MATCHES THE FIRST LETTER OF A MENU CHOICE
WHILE CHARPOS% = 0
ANS$ = INKEY$
IF (ANS$ <> "") THEN
ANS$ = UCASE$(ANS$)
CHARPOS% = INSTR(OKSTRING$, ANS$)
IF (CHARPOS% = 0) THEN BEEP
END IF
WEND
PRINT ANS$
LOCATE 11 + NUMOFCHOICES%, 23, 0
PRINT "<ENTER> TO CONFIRM; <ESC> TO REDO."
NUMCHOSEN% = CHARPOS%
CHARPOS% = 0
' WAIT FOR <ENTER> (CONFIRM) OR <ESC> (REDO)
WHILE CHARPOS% = 0
ANS$ = INKEY$
IF (ANS$ <> "") THEN
CHARPOS% = INSTR(CTRLKEYS$, ANS$)
IF (CHARPOS% = 0) THEN BEEP
END IF
WEND
IF (CHARPOS% = 1) THEN
DONE% = true%
CLS
ELSE
LOCATE 11 + NUMOFCHOICES%, 23: PRINT SPACE$(35)
LOCATE 9 + NUMOFCHOICES%, LC% + 3 + LEN(PROMPT$): PRINT " ";
LOCATE , POS(0) - 1:
END IF
WEND
END SUB

SUB  Options  (OPTS$()) 
STARTAGAIN: 

CALL  Menu(OPTS$(),  CHOICE%) 
IF  CHOICE%  =  1  THEN 

CLS 

LOCATE  1,  20 

PRINT  "PRINTER  SETTINGS  INITIALIZED" 

LPRINT  CHR$(27);  "3"; 

LPRINT CHR$(7);
END IF
IF CHOICE% = 2 THEN

CLS 

LOCATE  1,  20 

PRINT  "TINY  PRINT  SET" 

LPRINT  CHR$(27);  "3"; 

LPRINT  CHR$(27);  "3";  CHR$(17);  :  LPRINT  CHR$(15);  : 

LPRINT  CHR$(27);  "S";  "1"; 

LPRINT  CHR$(7); 
END  IF 

IF CHOICE% = 3 THEN
CLS
LOCATE 10, 20
INPUT "ENTER DRAW DELAY FACTOR (1 TO 10)...DEFAULT=1"; DELAY
END  IF 

IF  CHOICE%  =  4  THEN  EXIT  SUB 
GOTO  STARTAGAIN 
END  SUB 

SUB Options2 (anals$())
STARTAGAIN2:
CALL Menu(anals$(), CHOICE%)
IF CHOICE% = 1 THEN CALL An1
IF CHOICE% = 2 THEN CALL An2
IF CHOICE% = 3 THEN CALL CORRELOGRAM
IF CHOICE% = 4 THEN CALL AN3
IF CHOICE% = 5 THEN EXIT SUB

GOTO  STARTAGAIN2 
END  SUB 

SUB Printfile (arraydat())
INPUT "ENTER RUN NUMBER OR LETTER======>", a$
B$ = "C:\design\DATA00" + a$ + ".PRN"
OPEN B$ FOR OUTPUT AS #1
FOR n = 1 TO simlength
IF arraydat(n) < 0 THEN arraydat(n) = 0
PRINT #1, arraydat(n)


REM  PRINT  arraydat(n)   'remove  <rem>  to  get  screen  printout 

NEXT  n 

CLOSE 
CLS 

LOCATE 4, 18: PRINT "DESCRIPTIVE STATISTICS"
LOCATE 6, 20: PRINT "THE MEAN IS ";
PRINT USING "######.###"; xbar
LOCATE 7, 20: PRINT "THE SD IS ";
PRINT USING "######.###"; xsd
LOCATE 8, 20: PRINT "THE MINIMUM IS ";
PRINT USING "######.###"; MIN
LOCATE 9, 20: PRINT "THE MAXIMUM IS ";
PRINT USING "######.###"; max
LOCATE 10, 20: PRINT "RANGE IS ";
PRINT USING "######.###"; max - MIN
LOCATE 11, 20: PRINT "THE VARIANCE OF X IS ";
PRINT USING "######.###"; Xvar
LOCATE 12, 20: PRINT "THE SKEWNESS COEFFICIENT IS     ";
PRINT USING "  ##.####"; skew
LOCATE 13, 20: PRINT "THE KURTOSIS COEFFICIENT IS     ";
PRINT USING "   ##.####"; Kurt
LOCATE 14, 20: PRINT "THE EXCESS COEFFICIENT IS       ";
PRINT USING "  ##.####"; excess
LOCATE 15, 20: PRINT "THE COEFFICIENT OF VARIATION IS ";
PRINT USING "   ###.#"; V
LOCATE 16, 20: PRINT "AUTOCORR. COEFFICIENT (LAG=1) IS ";
PRINT USING "   ##.####"; R
CALL FRAME(15, 65, 2, 17)
LPRINT  "" 

LPRINT DATE$; : LPRINT "  "; : LPRINT TIME$
LPRINT ""
LPRINT "DESCRIPTIVE STATISTICS"
LPRINT ""
LPRINT B$
LPRINT ""
LPRINT "THE MEAN IS ";
LPRINT USING "######.###"; xbar
LPRINT "THE SD IS ";
LPRINT USING "######.###"; xsd
LPRINT "THE MINIMUM IS ";
LPRINT USING "######.###"; MIN
LPRINT "THE MAXIMUM IS ";
LPRINT USING "######.###"; max
LPRINT "RANGE IS ";
LPRINT USING "######.###"; max - MIN
LPRINT "THE VARIANCE OF X IS ";
LPRINT USING "######.###"; Xvar
LPRINT "THE SKEWNESS COEFFICIENT IS ";
LPRINT USING "   ##.####"; skew
LPRINT "THE KURTOSIS COEFFICIENT IS ";
LPRINT USING "   ##.####"; Kurt
LPRINT "THE EXCESS COEFFICIENT IS ";
LPRINT USING "  ##.####"; excess
LPRINT "THE COEFFICIENT OF VARIATION IS ";
LPRINT USING "   ###.#"; V
LPRINT "AUTOCORR. COEFFICIENT (LAG=1) IS ";
LPRINT USING "   ##.####"; R
END SUB

SUB Proc1 (matrix(), samplenum%)
FOR LINES = 1 TO 3
NEXT
sumx = 0: sumy = 0: MIN = 99999: max = 0
FOR z = 1 TO samplenum% ' COMPUTE SUM OF X AND SUM OF Y
sumx = sumx + matrix(z)
IF z = 1 THEN sumy = sumy + matrix(samplenum%)
IF z <> 1 THEN sumy = sumy + matrix(z - 1)
NEXT z

xbar = sumx / samplenum%: ybar = sumy / samplenum% ' COMPUTE X AND Y MEANS
IF LPS = 1 THEN MU = xbar
IF FLAG6 = 1 THEN STANDIN = MU
IF FLAG6 = 2 THEN STANDIN = xbar
x = 0: y = 0: x2 = 0: y2 = 0: xy = 0: sumx2 = 0: sumy2 = 0: sumxy = 0
x3 = 0: x4 = 0: sumx3 = 0: sumx4 = 0
FOR z = 1 TO samplenum%
x = matrix(z) - STANDIN
IF z = 1 THEN y = matrix(samplenum%) - ybar
IF z <> 1 THEN y = matrix(z - 1) - ybar
x2 = x ^ 2
x3 = x ^ 3
x4 = x ^ 4
y2 = y ^ 2
xy = x * y
sumx2 = sumx2 + x2
sumx3 = sumx3 + x3
sumx4 = sumx4 + x4
sumy2 = sumy2 + y2
sumxy = sumxy + xy

NEXT  z 
xsd  =  SQR(sumx2  /  samplenum%) 
IF  xsd  =  0  OR  STANDIN  =  0  THEN 

BEEP 

PRINT "ALL SAMPLES SELECTED WERE ZERO... CASE SKIPPED FOR FREQ="; LPS

LPRINT  "ALL  SAMPLES  SELECTED  WERE  ZERO...  CASE  SKIPPED  FOR  FREQ=";  LPS 

EXIT  SUB 
END  IF 

ysd = SQR(sumy2 / samplenum%)
skew = sumx3 / ((xsd ^ 3) * samplenum%)
Kurt = sumx4 / ((xsd ^ 4) * samplenum%)
excess = Kurt - 3
Xvar = xsd * xsd: yvar = ysd * ysd
V = (xsd / STANDIN) * 100
IF (sumx2 * sumy2) <> 0 THEN R = sumxy / (SQR(sumx2 * sumy2))
IF (sumx2 * sumy2) = 0 THEN R = 1
COVAR = sumxy / samplenum%
V1 = samplenum% - 1
cl = t975(V1) * ((xsd) / SQR(samplenum%))
Ucl = xbar + cl
LCL = xbar - cl
IF LCL < 0 THEN LCL = 0
Range = 2 * cl
COV = (cl / STANDIN) * 100

REM PRINT USING "##"; LPS, samplenum%;
REM PRINT USING "   #####.##"; XBAR, XSD, XVAR, R, CL

SELECT  CASE  LPS 

CASE  1 

IF samplenum% = 30 THEN LPS$ = "D"
CASE  7 

LPS$  =  "W" 
CASE  999 

LPS$  =  "T" 
CASE  2.3 

LPS$  =  "T" 
CASE  ELSE 

LPS$  =  "-" 
END  SELECT 
LPRINT "   " + LPS$;
LPRINT USING "   ##  "; samplenum%;
LPRINT USING "   ####.##"; Ucl, xbar, LCL, Range, xsd;
LPRINT USING "  ######.##"; Xvar;
LPRINT USING "   ####.##"; R, skew, excess, ((xbar - MU) / MU) * 100;
LPRINT USING "     ##"; LPS
END  SUB 
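
The moment statistics computed in the subprogram above (standard deviation with divisor n, skewness, kurtosis, excess and coefficient of variation) can be sketched in Python as follows. This is an illustrative translation only, not part of the report's program: the function name `moment_stats` and the dictionary return value are assumptions, and deviations are taken about the sample mean (the FLAG6 = 2 case in the listing).

```python
import math

def moment_stats(values):
    """Population-style moments, mirroring the formulas in the BASIC listing."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    sd = math.sqrt(sum(d ** 2 for d in dev) / n)  # divisor n, as in the listing
    if sd == 0 or mean == 0:
        return None  # the listing skips this case rather than dividing by zero
    skew = sum(d ** 3 for d in dev) / (sd ** 3 * n)
    kurt = sum(d ** 4 for d in dev) / (sd ** 4 * n)
    return {
        "mean": mean,
        "sd": sd,
        "skew": skew,
        "kurtosis": kurt,
        "excess": kurt - 3,
        "cv": sd / mean * 100,  # coefficient of variation, percent
    }
```

Calling `moment_stats([1, 2, 3, 4, 5])` gives a symmetric case in which the skewness is zero.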

SUB PROC2 (arraydat(), samplenum%)
OPEN "c:\DESIGN\autocor" + rn$ + ".prn" FOR OUTPUT AS #2
LPRINT "CORRELOGRAM DATA WRITTEN TO C:\DESIGN\AUTOCOR" + rn$ + ".PRN"
samplenum% = 365
FOR lag = 1 TO 180
PRINT "lag loop #"; lag
sumx = 0: sumy = 0
FOR z = (lag + 1) TO samplenum% ' COMPUTE SUM OF X AND SUM OF Y
sumx = sumx + arraydat(z)
sumy = sumy + arraydat(z - lag)
NEXT z
count = samplenum% - lag
xbar = sumx / count: ybar = sumy / count ' COMPUTE X AND Y MEANS
x = 0: y = 0: x2 = 0: y2 = 0: xy = 0: sumx2 = 0: sumy2 = 0: sumxy = 0
FOR z = (lag + 1) TO samplenum%
x = arraydat(z) - xbar
y = arraydat(z - lag) - ybar
x2 = x * x
y2 = y * y
xy = x * y
sumx2 = sumx2 + x2
sumy2 = sumy2 + y2
sumxy = sumxy + xy
NEXT z

IF (sumx2 * sumy2) <> 0 THEN R = sumxy / (SQR(sumx2 * sumy2))
IF (sumx2 * sumy2) = 0 THEN R = 1
PRINT #2, lag, R
NEXT    lag 

END  SUB 
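
The lagged correlation that PROC2 writes out for each lag can be sketched in Python as follows. This is a hedged illustration rather than the report's code: the function name `correlogram` and the list-based input are assumptions. Like PROC2, it pairs each value with the value `lag` steps earlier, uses separate means for the two sub-series, and sets r to 1 when either sum of squares is zero.

```python
import math

def correlogram(series, max_lag):
    """Return [(lag, r), ...] for lag = 1..max_lag (max_lag must be < len(series))."""
    n = len(series)
    out = []
    for lag in range(1, max_lag + 1):
        xs = series[lag:]        # values at positions lag+1 .. n
        ys = series[:n - lag]    # the same values shifted back by `lag`
        count = n - lag
        xbar = sum(xs) / count
        ybar = sum(ys) / count
        sx2 = sum((x - xbar) ** 2 for x in xs)
        sy2 = sum((y - ybar) ** 2 for y in ys)
        sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
        # Same guard as the listing: report r = 1 if a sum of squares is zero
        r = sxy / math.sqrt(sx2 * sy2) if sx2 * sy2 != 0 else 1.0
        out.append((lag, r))
    return out
```

For a 365-day series, `correlogram(data, 180)` would cover the same lags as the listing.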

'    SUBPROGRAM  READ  IN 
SUB READIN (arraydat())
LPRINT "  "
OPEN "c:\DESIGN\DATA00" + rn$ + ".PRN" FOR INPUT AS #1
LPRINT "DATASET " + " c:\DESIGN\DATA00" + rn$ + ".PRN"

LPRINT  "" 

n  =  1 

'READ  DATA  INTO  ARRAYDAT 


WHILE EOF(1) = 0

INPUT  #1,  pres 

arraydat(n)  =  pres 

n  =  n  +  1 

WEND 

CLOSE 

PRINT "  "
PRINT "RAW DATA LOADED INTO 'ARRAYDAT'"
INPUT "PRINT OUT SIMULATED DATA SET ? (Y/y)", a$
IF a$ = "Y" THEN GOTO PRINT1
IF a$ = "y" THEN GOTO PRINT1
GOTO  SKIP2 
PRINT1: 
FOR  X  =  1  TO  365 

'LPRINT  USING  "      ###";  x; 

LPRINT  USING  "######";  arraydat(x); 

NEXT 

BEEP 

LPRINT  CHR$(12); 

aa$  =  "" 

PRINT  "CHANGE  PAPER  AND  PRESS  A  KEY  WHEN  READY" 

WHILE aa$ = ""

aa$  =  INKEY$ 
WEND 

LPRINT  "SIMULATION  RUN  NUMBER  ";  rn$ 
LPRINT DATE$
LPRINT TIME$
LPRINT  "" 
SKIP2: 
END  SUB 

SUB readin2 (arraydat())
CLS
PRINT "PRESENCE/ABSENCE ANALYSIS PROGRAM"
INPUT "ENTER RUN NUMBER ============>"; rn$
LPRINT "SAMPLING DATASET 'C:\DESIGN\DATA00" + rn$ + ".PRN'"
INPUT "ENTER DETECTION LIMIT =======>"; DL
LPRINT "  "
LPRINT "SELECTED DETECTION LIMIT ="; DL
PRINT "  "
LPRINT "  "
OPEN "c:\DESIGN\DATA00" + rn$ + ".PRN" FOR INPUT AS #1
REM NUM1 IS THE NUMBER OF 1'S OR NUMBER OF ABOVE DETECTION LIMIT DAYS
n = 1: Num1 = 0
WHILE EOF(1) = 0
INPUT #1, pres
temp = pres
IF pres < DL THEN pres = 0
IF pres >= DL THEN pres = 1
IF pres = 1 THEN Num1 = Num1 + 1
REM PRINT N, temp, pres
arraydat(n) = pres
n = n + 1
WEND
CLOSE
END SUB


SUB READIN3 (arraydat())
PRINT "  "
LPRINT "  "
OPEN "c:\DESIGN\DATA00" + rn$ + ".PRN" FOR INPUT AS #1
LPRINT "DATASET " + " c:\DESIGN\DATA00" + rn$ + ".PRN"
LPRINT ""
n = 1
WHILE EOF(1) = 0 'READ DATA INTO ARRAYDAT
INPUT #1, pres
arraydat(n) = pres
n = n + 1
WEND
CLOSE
PRINT "  "
PRINT "RAW DATA LOADED INTO 'ARRAYDAT'"
INPUT "PRINT OUT SIMULATED DATA SET ? (Y/y)", a$
IF a$ = "Y" THEN GOTO PRINT2
IF a$ = "y" THEN GOTO PRINT2
GOTO SKIP3
PRINT2:
FOR x = 1 TO 365: LPRINT USING "       ###"; x; : LPRINT USING "#####       "; arraydat(x); : NEXT
BEEP
LPRINT CHR$(12);
aa$ = ""
PRINT "CHANGE PAPER AND PRESS A KEY WHEN READY"
WHILE aa$ = ""
aa$ = INKEY$
WEND
LPRINT "SIMULATION RUN NUMBER "; rn$
LPRINT DATE$
LPRINT TIME$
LPRINT ""
SKIP3:
END SUB


SUB  Refresh 

SCREEN  2 

DEF  SEG  =  &HB800 

BLOAD "C:\design\PICTURE", 0

CALL  waiter 
END  SUB 

SUB Sample (arraydat(), matrix(), samplenum%)
j = 0
FOR n = ((MONTH * 30) - 29) TO (MONTH * 30)
IF n MOD LPS = 0 THEN
j = j + 1
matrix(j) = arraydat(n)
END IF
samplenum% = j
CLOSE
NEXT
END SUB

SUB  Screensave 

'LOCATE  1,  65:  PRINT  "  " 

'LOCATE  2,  65:  PRINT  "  " 

DEF  SEG  =  &HB800 

BSAVE "C:\design\PICTURE", 0, &H4000

END  SUB 

SUB  Script 

B$ = "C:\design\DATA00" + rn$ + ".PRN"

OPEN  B$  FOR  OUTPUT  AS  #1 

FOR  n  =  1  TO  simlength 

IF  arraydat(n)  <  0  THEN  arraydat(n)  =  0 

PRINT  #1,  arraydat(n) 

REM PRINT arraydat(n)        'remove <rem> to get screen printout

NEXT   n 

CLOSE 

LPRINT  "DATA  SET  WRITTEN  TO  ";  B$ 
END  SUB 

SUB Summarize
CLS
SCREEN 0
LOCATE 1, 65
PRINT "PLEASE WAIT..."
sumx = 0: sumy = 0: MIN = 99999: max = 0
FOR z = 1 TO 365 ' COMPUTE SUM OF X AND SUM OF Y
sumx = sumx + arraydat(z)
IF arraydat(z) < MIN THEN MIN = arraydat(z)
IF arraydat(z) > max THEN max = arraydat(z)
IF z > 1 THEN sumy = sumy + arraydat(z - 1)
NEXT z

xbar = sumx / 365: ybar = sumy / 365 ' COMPUTE X AND Y MEANS
x = 0: y = 0: x2 = 0: x3 = 0: x4 = 0: y2 = 0: xy = 0
sumx2 = 0: sumx3 = 0: sumx4 = 0: sumy2 = 0: sumxy = 0
FOR z = 1 TO 365
x = arraydat(z) - xbar
IF z > 1 THEN y = arraydat(z - 1) - ybar
x2 = x ^ 2
x3 = x ^ 3
x4 = x ^ 4
y2 = y * y
xy = x * y
sumx2 = sumx2 + x2
sumx3 = sumx3 + x3
sumx4 = sumx4 + x4
sumy2 = sumy2 + y2
sumxy = sumxy + xy
NEXT z

xsd = SQR(sumx2 / 365)
ysd = SQR(sumy2 / 365)
skew = sumx3 / ((xsd ^ 3) * 365) 'FOR N=365 ONLY!
Kurt = sumx4 / ((xsd ^ 4) * 365)

Xvar = xsd * xsd: yvar = ysd * ysd
V = (xsd / xbar) * 100
excess = Kurt - 3
R = sumxy / (SQR(sumx2 * sumy2))

CLS 

LOCATE 4, 18: PRINT "DESCRIPTIVE STATISTICS"
LOCATE 6, 20: PRINT "THE MEAN IS ";
PRINT USING "######.###"; xbar
LOCATE 7, 20: PRINT "THE SD IS ";
PRINT USING "######.###"; xsd
LOCATE 8, 20: PRINT "THE MINIMUM IS ";
PRINT USING "######.###"; MIN
LOCATE 9, 20: PRINT "THE MAXIMUM IS ";
PRINT USING "######.###"; max
LOCATE 10, 20: PRINT "RANGE IS ";
PRINT USING "######.###"; max - MIN
LOCATE 11, 20: PRINT "THE VARIANCE OF X IS ";
PRINT USING "######.###"; Xvar
LOCATE 12, 20: PRINT "THE SKEWNESS COEFFICIENT IS ";
PRINT USING "  ##.####"; skew
LOCATE 13, 20: PRINT "THE KURTOSIS COEFFICIENT IS ";
PRINT USING "  ##.####"; Kurt
LOCATE 14, 20: PRINT "THE EXCESS COEFFICIENT IS ";
PRINT USING "  ##.####"; excess
LOCATE 15, 20: PRINT "THE COEFFICIENT OF VARIATION IS ";
PRINT USING "  ###.#"; V
LOCATE 16, 20: PRINT "AUTOCORR. COEFFICIENT (LAG=1) IS ";
PRINT USING "  ##.####"; R
CALL FRAME(15, 65, 2, 17)
CALL waiter
END SUB


SUB Thrice (arraydat(), matrix(), samplenum%)
LPS = 999 'RESET SO THAT MU IS NOT RECALCULATED
FLAG = -1
j = 0
FOR n = ((MONTH * 30) - 29) TO (MONTH * 30)
IF n MOD 7 = 0 THEN
FLAG = FLAG * -1
GOTO SKIP
END IF
SELECT CASE FLAG
CASE -1
IF n MOD 2 = 0 THEN
j = j + 1
matrix(j) = arraydat(n)
REM LPRINT N, matrix(j)
END IF
CASE 1
IF (n - 1) MOD 2 = 0 THEN
j = j + 1
matrix(j) = arraydat(n)
REM LPRINT N, matrix(j)
END IF
END SELECT
IF j = 0 THEN GOTO SKIP
SKIP:
samplenum% = j
NEXT
LPS = 2.3
END SUB

SUB  waiter 

LOCATE  25,  65 

PRINT "PRESS A KEY..."
aa$ = INKEY$
WHILE aa$ = ""
aa$ = INKEY$
WEND 

END  SUB 
#
# This macro generates a binary (0/1) series (e.g., a time series) having
# specified parameters. Then a macro ('BINSER3.DAT') can be executed to
# analyze the series and to evaluate the efficiency of various
# sampling schemes. For more detail, leave MINITAB and read the file
# 'BINSER.DOC'.
#
# Enter "LET K1 = the mean mu of the series, i.e. the true long-term mean
# probability of state 1 occurring"
# Enter "LET K2 = the autocorrelation coefficient rho (lag 1)"
# Enter "LET K3 = the length N of the series to be generated"
# Enter "EXEC 'BINSER1.DAT' "
#
# The true long-term mean probability of state 1 is:
PRINT K1
# The autocorrelation coefficient is:
PRINT K2
# The length of the series to be generated is:
PRINT K3
LET K8=K3-1
LET K6=K8*K2*K2
LET K7=1-K1
#
# The expected chi-square (1 df) is:
PRINT K6
LET K9=1-K1+K2*K1
LET K10=1-K9
SET C4
0 1
END
SET C5
K9 K10
END
OH=0
LET K10=K1+K2*(1-K1)
LET K9=1-K10
SET C6
0 1
END
SET C7
K9 K10
END
SET C8
0 1
END
SET C9
K7 K1
END
RANDOM 1 C10;
DISCRETE C8 C9.
LET K9=1
EXEC 'BINSER2.DAT' K8 TIMES
OH=24
#
# The series has been generated, and is in column C10. If you want
# to analyze it, enter "EXEC 'BINSER3.DAT' ".
#
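
The transition probabilities set up in the macro above describe a two-state chain: from state 0 the probability of moving to state 1 is mu(1 - rho), and from state 1 the probability of staying in state 1 is mu + rho(1 - mu). A minimal Python sketch of the same generator follows. It is an illustration only; the function name `binary_series` and the `seed` parameter are assumptions, and the parameters must be chosen so that both probabilities lie between 0 and 1.

```python
import random

def binary_series(mu, rho, n, seed=None):
    """Generate a 0/1 series with long-term mean mu and lag-1 autocorrelation rho."""
    rng = random.Random(seed)
    p1_given_0 = mu * (1 - rho)        # K10 in the first SET block: 1 - (1 - mu + rho*mu)
    p1_given_1 = mu + rho * (1 - mu)   # K10 in the second SET block
    state = 1 if rng.random() < mu else 0   # first value drawn from (1 - mu, mu)
    series = [state]
    for _ in range(n - 1):             # K3 - 1 further draws, as in the EXEC loop
        p1 = p1_given_1 if state == 1 else p1_given_0
        state = 1 if rng.random() < p1 else 0
        series.append(state)
    return series
```

For example, `binary_series(0.2, 0.5, 365, seed=1)` would produce one simulated year of daily presence/absence values.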
BINSER2.DAT

Thursday, July 14, 1988

Page 1

LET K10=4+2*C10(K9)
LET K11=K10+1
RANDOM 1 C11;
DISCRETE CK10 CK11.
STACK C10 C11, C10
LET K9=K9+1
#
# You may be executing this macro as a follow-up to 'BINSER1.DAT'
# which would have been used to generate a binary series with
# specified parameters and to store it in column C10. If so, you will
# interpret the following analysis and evaluation as a description of a
# simulated series which has known (specified) parameters. Thus the
# sample estimates of the parameters (from 'BINSER3.DAT') can be
# compared to their known values (the input to 'BINSER1.DAT').
#
# Alternatively, the binary series (which must be stored in column C10)
# could be a real monitoring data set where state 1 represents
# "detectable" or "violation of the standard" and state 0 represents
# "below detectable" or "not in violation of the standard". If this is
# the case, then the following analysis and evaluation is for data
# whose parameters are not known but for which estimates are desired.
# You may wish to use these parameter estimates (of mu and rho)
# as input to 'BINSER1.DAT', to simulate binary series having these
# properties, and then to run 'BINSER3.DAT' again to evaluate the
# efficiency of various sampling schemes for use in sampling such
# monitoring data.
# Now we will analyze the binary series:
PRINT C10
TSPLOT C10
MEAN C10
RUNS 0.5 C10
LET C13(1)=C10(1)
LET K12=1
LET K13=2
LET K3=COUNT(C10)
LET K8=K3-1
OH=0
EXEC 'BINSER4.DAT' K8 TIMES
OH=24
CODE (-1) 0 C13, C13
PARSUM C13, C14
LET C15=C10*C14
LET K12=MAXI(C15)
#
# The number of runs of 1 is:
PRINT K12
# Here is the binary series again, with the 1's replaced by the run
# number (1st, 2nd, ... run of 1's):
PRINT C15
ACF 5 C10
COPY C10 C12;
OMIT 1:1.
COPY C10 C11;
OMIT K3:K3.
CORR C11 C12, M1
COPY M1 C12-C13
LET K9=C13(1)
LET K10=K8*K9*K9
# The autocorrelation coefficient and the chi-square (1 df) are:
PRINT K9 K10
# Now we only observe at intervals (e.g. the series is daily and you
# are sampling every so many days).
# LET K14 = The sampling (observation) interval you want, e.g. 3 if you
# want every 3 days, and then enter "EXEC 'BINSER5.DAT' ".
BINSER4.DAT

Thursday, July 14, 1988

Page 1

LET C13(K13)=C10(K13)-C10(K12)
LET K12=K12+1
LET K13=K13+1
#
# The sampling interval is now:
PRINT K14
LET K9=ROUND(K3/K14+0.5)
LET K15=K14-1
SET C12
K15(0)
END
STACK C12 1, C12
OH=0
EXEC 'BINSER6.DAT' K9 TIMES
OH=24
COPY C11 C11;
USE 1:K3.
LET C12=C15*C11
NAME C15 'SERIES', C11 'OBSERVE', C12 'DETECTED'
#
# Following are the generated series (col. 1), observation times (col. 2),
# and occurrences of 1 that are detected (col. 3):
PRINT C15 C11 C12
LET C16=C11*C15
#
# The number of runs of 1 that were missed by a given sampling scheme can
# be calculated as the number of runs of 1 minus the number of non-zero
# categories listed in the following TALLY of the data.
TALLY C16
#
# You can change the value stored in K14 and "EXEC 'BINSER5.DAT' " again.
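
The missed-run count described in the macro comments (number of runs of 1 minus the number of distinct runs that were hit by an observation) can be sketched in Python as follows. This is a hedged illustration, not the macro itself: the function names `label_runs` and `runs_missed` are assumptions, and observations are taken at positions K14, 2*K14, ... of the series, matching the pattern of zeros followed by a 1 that the macro builds.

```python
def label_runs(series):
    """Replace each 1 by its run number (1st, 2nd, ... run of 1's); 0 stays 0."""
    labels, run_no, prev = [], 0, 0
    for s in series:
        if s == 1 and prev == 0:
            run_no += 1            # a new run of 1's starts here
        labels.append(run_no if s == 1 else 0)
        prev = s
    return labels, run_no

def runs_missed(series, interval):
    """Number of runs of 1 never observed when sampling every `interval`-th value."""
    labels, n_runs = label_runs(series)
    observed = {labels[i] for i in range(interval - 1, len(series), interval)}
    observed.discard(0)            # 0 means "nothing detected" at that observation
    return n_runs - len(observed)
```

With `interval = 1` every value is observed, so no run can be missed; larger intervals miss progressively more short runs.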


BINSER6.DAT

STACK C12 C11, C11
APPENDIX C

SIMULATED  DATA  SETS  USED  AS 
EXAMPLES  IN  THE  REPORT 
3.  DATA USED TO REPRESENT DIFFERENT LEVELS OF
INDUSTRIAL VARIABILITY


C1:  TIME SERIES SCATTERGRAMS

Following  are  plots  of  concentration  vs.  time  for  the  19  simulated  data  sets  used  as 
examples  in  the  report. 
TIME SERIES SCATTERPLOT FOR RUN 13

C2:  DESCRIPTIVE STATISTICS

Following are basic descriptive statistics for the 19 simulated data sets used in the
report.  The statistics were calculated using STATPAC.  The data sets are contained
on the distribution disk.
SUMMARY OF SIMULATED DATA SETS


RUN #    KOLMOGOROV-SMIRNOV STATISTIC

  1        2.5434
  2        5.4244
  3        5.6324
  4        5.0121
  5        0.9643
  6        1.0545
  7        1.4440
  8        4.2958
  9        0.7927
 10        0.8333
 11        5.1572
 12        3.8969
 13        1.0032
 14        2.5905
 15        3.8268
 16        0.8071
 17        1.5542
 18        1.1431
 19        6.3586


The Kolmogorov-Smirnov statistic provides a quick check of the degree of
normality in a data set.  The value gives a relative indication of
normality: the further it moves from zero, the more certain we can be that
the data do not approximate a normal distribution.  The distribution is
non-normal at the .025 level if KS > .955.

Using this criterion, run numbers 5, 9, 10 and 16 are considered to be
approximately normally distributed.
StatPac Gold Statistical Analysis Package

DESCRIPTIVE STATISTICS FOR RUN 01
concentration

Minimum                                        = 0
Maximum                                        = 1010.56
Range                                          = 1010.5600
Sum                                            = 139552.3106092
Mean                                           = 382.3351
Median                                         = 380.7280
Mode                                           = 0
Variance                                       = 91134.3712
Standard deviation                             = 301.8847
Standard error of the mean                     = 15.8231

95 Percent confidence interval around the mean = 351.3219 - 413.3483
Variance (unbiased)                            = 91384.7403
Standard deviation (unbiased)                  = 302.2991
Skewness                                       = 0.1249
Kurtosis                                       = 1.6623
Kolmogorov-Smirnov statistic for normality     = 2.5434

Valid cases                                    = 365
Missing cases                                  = 0
Response percent                               = 100.0 %
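
As a consistency check on the Run 01 output, the standard error and the 95 percent confidence interval can be reproduced from the printed mean, the unbiased standard deviation and the 365 valid cases. The sketch below assumes the interval is the mean plus or minus 1.96 standard errors; that multiplier is an assumption about how the package computes the interval, not something stated in the output.

```python
import math

n = 365                  # valid cases for Run 01
mean = 382.3351          # printed mean
sd_unbiased = 302.2991   # printed standard deviation (unbiased)

se = sd_unbiased / math.sqrt(n)    # standard error of the mean
half_width = 1.96 * se             # assumed normal-theory multiplier
lower, upper = mean - half_width, mean + half_width

print(round(se, 4), round(lower, 4), round(upper, 4))
```

The rounded values agree with the printed standard error and interval for Run 01.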

StatPac Gold Statistical Analysis Package

DESCRIPTIVE STATISTICS FOR RUN 02
concentration

Minimum                                        = 0
Maximum                                        = 157.5583
Range                                          = 157.5583
Sum                                            = 7395.5723875
Mean                                           = 20.2618
Median                                         = 12.1780
Mode                                           = 0
Variance                                       = 706.0604
Standard deviation                             = 26.5718
Standard error of the mean                     = 1.3927

95 Percent confidence interval around the mean = 17.5321 - 22.9916
Variance (unbiased)                            = 708.0001
Standard deviation (unbiased)                  = 26.6083
Skewness                                       = 2.6104
Kurtosis                                       = 10.0423
Kolmogorov-Smirnov statistic for normality     = 5.4244

Valid cases                                    = 365
Missing cases                                  = 0
Response percent                               = 100.0 %

StatPac Gold Statistical Analysis Package

DESCRIPTIVE STATISTICS FOR RUN 03
concentration

Minimum                                        = 0
Maximum                                        = 299.0852
Range                                          = 299.0852
Sum                                            = 20758.9140527
Mean                                           = 56.8737
Median                                         = 6.2462
Mode                                           = 0
Variance                                       = 6122.0974
Standard deviation                             = 78.2438
Standard error of the mean                     = 4.1011

95 Percent confidence interval around the mean = 48.8356 - 64.9119
Variance (unbiased)                            = 6138.9164
Standard deviation (unbiased)                  = 78.3512
Skewness                                       = 1.0821
Kurtosis                                       = 2.6903
Kolmogorov-Smirnov statistic for normality     = 5.6324

Valid cases                                    = 365
Missing cases                                  = 0
Response percent                               = 100.0 %

StatPac Gold Statistical Analysis Package

DESCRIPTIVE STATISTICS FOR RUN 04
concentration

Minimum                                        = 0
Maximum                                        = 885.0707
Range                                          = 885.0707
Sum                                            = 39388.6643124
Mean                                           = 107.9141
Median                                         = 15.3875
Mode                                           = 0
Variance                                       = 27753.8737
Standard deviation                             = 166.5949
Standard error of the mean                     = 8.7319

95 Percent confidence interval around the mean = 90.7995 - 125.0288
Variance (unbiased)                            = 27830.1206
Standard deviation (unbiased)                  = 166.8236
Skewness                                       = 2.1538
Kurtosis                                       = 8.1432
Kolmogorov-Smirnov statistic for normality     = 5.0121

Valid cases                                    = 365
Missing cases                                  = 0
Response percent                               = 100.0 %

DESCRIPTIVE STATISTICS FOR RUN 05
concentration

Minimum                                        = 229.651
Maximum                                        = 552.093
Range                                          = 322.4420
Sum                                            = 151210.1404
Mean                                           = 414.2744
Median                                         = 418.4287
Mode                                           = Multi-Modal
Variance                                       = 2704.5700
Standard deviation                             = 52.0055
Standard error of the mean                     = 2.7258

95 Percent confidence interval around the mean = 408.9317 - 419.6170
Variance (unbiased)                            = 2712.0001
Standard deviation (unbiased)                  = 52.0769
Skewness                                       = -0.1060
Kurtosis                                       = 2.9156
Kolmogorov-Smirnov statistic for normality     = 0.9643

Valid cases                                    = 365
Missing cases                                  = 0
Response percent                               = 100.0 %

DESCRIPTIVE STATISTICS FOR RUN 06
concentration

Minimum                                        = 118.2183
Maximum                                        = 1900
Range                                          = 1781.7817
Sum                                            = 159334.1992
Mean                                           = 436.5321
Median                                         = 435.2805
Mode                                           = Multi-Modal
Variance                                       = 21042.6954
Standard deviation                             = 145.0610
Standard error of the mean                     = 7.6033

95 Percent confidence interval around the mean = 421.6297 - 451.4344
Variance (unbiased)                            = 21100.5050
Standard deviation (unbiased)                  = 145.2601
Skewness                                       = 2.7177
Kurtosis                                       = 29.6985
Kolmogorov-Smirnov statistic for normality     = 1.0545

Valid cases                                    = 365
Missing cases                                  = 0
Response percent                               = 100.0 %
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DESCRIPTIVE  STATISTICS  FOR  RUN  06 
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DESCRIPTIVE  STATISTICS  FOR  RUN  07 
concentration

Minimum                     = 17.90778
Maximum                     = 1300
Range                       = 1282.0922
Sum                         = 131298.80505
Mean                        = 359.7228
Median                      = 333.3117
Mode                        = Multi-Modal
Variance                    = 24350.9750
Standard deviation          = 156.0480
Standard error of the mean  = 8.1791

95 Percent confidence interval around the mean = 343.6917 - 375.7539

Variance (unbiased)            = 24417.8733
Standard deviation (unbiased)  = 156.2622
Skewness                       = 0.7851
Kurtosis                       = 5.4773
Kolmogorov-Smirnov statistic for normality = 1.4440

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
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DESCRIPTIVE  STATISTICS  FOR  RUN  08 
concentration

Minimum                     = 0
Maximum                     = 440
Range                       = 440
Sum                         = 16885.4836029
Mean                        = 46.2616
Median                      = 36.6346
Mode                        = 0
Variance                    = 2384.2627
Standard deviation          = 48.8289
Standard error of the mean  = 2.5593

95 Percent confidence interval around the mean = 41.2453 - 51.2779

Variance (unbiased)            = 2390.8129
Standard deviation (unbiased)  = 48.8959
Skewness                       = 1.5427
Kurtosis                       = 12.4241
Kolmogorov-Smirnov statistic for normality = 4.2958

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
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DESCRIPTIVE  STATISTICS  FOR  RUN  09 
concentration

Minimum                     = 0
Maximum                     = 154
Range                       = 154
Sum                         = 27855
Mean                        = 76.3151
Median                      = 78
Mode                        = 66
Variance                    = 823.1966
Standard deviation          = 28.6914
Standard error of the mean  = 1.5038

95 Percent confidence interval around the mean = 73.3675 - 79.2626

Variance (unbiased)            = 825.4582
Standard deviation (unbiased)  = 28.7308
Skewness                       = 0.0069
Kurtosis                       = 2.5471
Kolmogorov-Smirnov statistic for normality = 0.7927

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %




StatPac  Gold  Statistical  Analysis  Package 


DESCRIPTIVE  STATISTICS  FOR  RUN  09 


[Histogram for Run 09: number of cases (vertical axis, 0 to 33) against concentration (horizontal axis, 0 to 150). The plotted points are not recoverable from the scan.]
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StatPac  Gold  Statistical  Analysis  Package 

DESCRIPTIVE STATISTICS FOR RUN 10
concentration

Minimum                     = 127.1027
Maximum                     = 951.273
Range                       = 824.1703
Sum                         = 171967.2661
Mean                        = 471.1432
Median                      = 470.7827
Mode                        = Multi-Modal
Variance                    = 22402.5133
Standard deviation          = 149.6747
Standard error of the mean  = 7.8451

95 Percent confidence interval around the mean = 455.7668 - 486.5196

Variance (unbiased)            = 22464.0586
Standard deviation (unbiased)  = 149.8801
Skewness                       = 0.2965
Kurtosis                       = 2.7661
Kolmogorov-Smirnov statistic for normality = 0.8333

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %

[Histogram for Run 10 (number of cases against concentration) not recoverable from the scan.]
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StatPac  Gold  Statistical  Analysis  Package 

DESCRIPTIVE STATISTICS FOR RUN 11
concentration

Minimum                     = 779.9222
Maximum                     = 4563.007
Range                       = 3783.0848
Sum                         = 454213.9525
Mean                        = 1244.4218
Median                      = 1081.9090
Mode                        = Multi-Modal
Variance                    = 239345.6321
Standard deviation          = 489.2296
Standard error of the mean  = 25.6426

95 Percent confidence interval around the mean = 1194.1624 - 1294.6813

Variance (unbiased)            = 240003.1750
Standard deviation (unbiased)  = 489.9012
Skewness                       = 3.2702
Kurtosis                       = 15.4785
Kolmogorov-Smirnov statistic for normality = 5.1572

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
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DESCRIPTIVE  STATISTICS  FOR  RUN  12 
concentration

Minimum                     = 0
Maximum                     = 1027.326
Range                       = 1027.3260
Sum                         = 82658.5856136
Mean                        = 226.4619
Median                      = 190.1387
Mode                        = 0
Variance                    = 55686.1177
Standard deviation          = 235.9791
Standard error of the mean  = 12.3687

95 Percent confidence interval around the mean = 202.2193 - 250.7045

Variance (unbiased)            = 55839.1015
Standard deviation (unbiased)  = 236.3030
Skewness                       = 0.7969
Kurtosis                       = 2.7959
Kolmogorov-Smirnov statistic for normality = 3.8969

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
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StatPac  Gold  Statistical  Analysis  Package 

DESCRIPTIVE STATISTICS FOR RUN 13
concentration

Minimum                     = 2.522949
Maximum                     = 4857.689
Range                       = 4855.1661
Sum                         = 1033004.893779
Mean                        = 2830.1504
Median                      = 2901.6680
Mode                        = Multi-Modal
Variance                    = 919733.4721
Standard deviation          = 959.0274
Standard error of the mean  = 50.2667

95 Percent confidence interval around the mean = 2731.6277 - 2928.6731

Variance (unbiased)            = 922260.2124
Standard deviation (unbiased)  = 960.3438
Skewness                       = -0.3201
Kurtosis                       = 2.7190
Kolmogorov-Smirnov statistic for normality = 1.0032

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
DESCRIPTIVE STATISTICS FOR RUN 14
concentration

Minimum                     = 36
Maximum                     = 249.3818
Range                       = 213.3818
Sum                         = 74652.80760000001
Mean                        = 204.5282
Median                      = 206.0482
Mode                        = Multi-Modal
Variance                    = 275.0866
Standard deviation          = 16.5857
Standard error of the mean  = 0.8693

95 Percent confidence interval around the mean = 202.8244 - 206.2321

Variance (unbiased)            = 275.8423
Standard deviation (unbiased)  = 16.6085
Skewness                       = -3.0920
Kurtosis                       = 31.5423
Kolmogorov-Smirnov statistic for normality = 2.5905

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
StatPac Gold Statistical Analysis Package

DESCRIPTIVE STATISTICS FOR RUN 15
concentration

Minimum                     = 910.9802
Maximum                     = 12340
Range                       = 11429.0198
Sum                         = 583891.5258
Mean                        = 1599.7028
Median                      = 1546.9790
Mode                        = Multi-Modal
Variance                    = 395959.8395
Standard deviation          = 629.2534
Standard error of the mean  = 32.9818

95 Percent confidence interval around the mean = 1535.0583 - 1664.3472

Variance (unbiased)            = 397047.6413
Standard deviation (unbiased)  = 630.1172
Skewness                       = 13.6274
Kurtosis                       = 232.6578
Kolmogorov-Smirnov statistic for normality = 3.8268

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
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DESCRIPTIVE  STATISTICS  FOR  RUN  16 
concentration

Minimum                     = 0
Maximum                     = 140
Range                       = 140
Sum                         = 25863
Mean                        = 70.8575
Median                      = 69
Mode                        = 0
Variance                    = 1075.2509
Standard deviation          = 32.7910
Standard error of the mean  = 1.7187

95 Percent confidence interval around the mean = 67.4889 - 74.2262

Variance (unbiased)            = 1078.2049
Standard deviation (unbiased)  = 32.8360
Skewness                       = -0.2098
Kurtosis                       = 2.5946
Kolmogorov-Smirnov statistic for normality = 0.8071

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
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DESCRIPTIVE  STATISTICS  FOR  RUN  17 
concentration

Minimum                     = 6578
Maximum                     = 17146
Range                       = 10568
Sum                         = 4449681
Mean                        = 12190.9068
Median                      = 12445
Mode                        = Multi-Modal
Variance                    = 3040268.0845
Standard deviation          = 1743.6365
Standard error of the mean  = 91.3914

95 Percent confidence interval around the mean = 12011.7793 - 12370.0342

Variance (unbiased)            = 3048620.4693
Standard deviation (unbiased)  = 1746.0299
Skewness                       = -0.2835
Kurtosis                       = 2.7637
Kolmogorov-Smirnov statistic for normality = 1.5542

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %

[Histogram for Run 17 (number of cases against concentration) not recoverable from the scan.]
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DESCRIPTIVE  STATISTICS  FOR  RUN  18 
concentration

Minimum                     = 7.276834
Maximum                     = 102.7069
Range                       = 95.4301
Sum                         = 17458.407344
Mean                        = 47.8313
Median                      = 47.0727
Mode                        = Multi-Modal
Variance                    = 150.4505
Standard deviation          = 12.2658
Standard error of the mean  = 0.6429

95 Percent confidence interval around the mean = 46.5712 - 49.0913

Variance (unbiased)            = 150.8638
Standard deviation (unbiased)  = 12.2827
Skewness                       = 0.3936
Kurtosis                       = 4.0401
Kolmogorov-Smirnov statistic for normality = 1.1431

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
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DESCRIPTIVE  STATISTICS  FOR  RUN  19 
concentration

Minimum                     = 67.08031
Maximum                     = 1300
Range                       = 1232.9197
Sum                         = 47272.92964
Mean                        = 129.5149
Median                      = 128.9166
Mode                        = Multi-Modal
Variance                    = 4113.6958
Standard deviation          = 64.1381
Standard error of the mean  = 3.3617

95 Percent confidence interval around the mean = 122.9258 - 136.1039

Variance (unbiased)            = 4124.9972
Standard deviation (unbiased)  = 64.2261
Skewness                       = 16.6282
Kurtosis                       = 303.9075
Kolmogorov-Smirnov statistic for normality = 6.3586

Valid cases      = 365
Missing cases    = 0
Response percent = 100.0 %
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C3:  DATA  USED  TO  REPRESENT  DIFFERENT 
INDUSTRIAL  VARIABILITY  LEVELS 
ABSOLUTE FREQUENCY DISTRIBUTION
LOW VARIABILITY INDUSTRY (DATA00VL.PRN)

[Bar chart not recoverable from the scan. The legible class limits run from 9 to 98 in steps of 9.]

RELATIVE FREQUENCY DISTRIBUTION
LOW VARIABILITY INDUSTRY (DATA00VL.PRN)

[Bar chart not recoverable from the scan. A ? marks a digit that is illegible in the scanned original.]

CUMULATIVE FREQUENCY (%)
  0.55
  0.55
  2.47
 13.42
 52.??
 85.75
 96.71
 99.18
 99.73
100.00

DESCRIPTIVE
C:\DESIGN\DATA00VL.PRN

THE MEAN IS                         45.190
THE SD IS                            9.442
THE MINIMUM IS                  [not legible]
THE MAXIMUM IS                  [not legible]
RANGE IS                        [not legible]
THE VARIANCE OF X IS            [not legible]
THE SKEWNESS COEFFICIENT IS     [not legible]
THE KURTOSIS COEFFICIENT IS     [not legible]
THE EXCESS COEFFICIENT IS       [not legible]
THE COEFFICIENT OF VARIATION IS [not legible]
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*  *  *  ^CONFIDENCE  INTERVAL  ANALYSIS  *  *  *  * 

GARTNER LEE LIMITED
SIMULATION RUN NUMBER VL
11-21-1988
21:25:46
STATISTICS CALCULATED USING POPULATION MEAN (Mu)

DATASET C:\DESIGN\DATA00VL.PRN

(A ? marks a digit or value that is illegible in the scanned original.)

MONTH 1
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    45.76    41.54    37.31     8.44    11.34   128.52    -0.05    -1.49     3.71     0.00     1
T   13    48.20    43.31    38.41     9.79     8.10    65.56     0.07    -0.13    -1.05     4.26     2
W    4    59.58    40.41    21.24    38.34    12.06   145.34    -0.71     0.38    -1.34    -2.71     7

MONTH 2
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    51.85    46.86    41.88     9.97    13.39   179.27    -0.04    -0.??        ?     0.00     1
T   13    52.01    46.29    40.58    11.43     9.45    89.39     0.26    -0.63    -0.59    -1.22     2
W    4    93.45    50.08     6.71    86.75    27.28   744.13    -0.66    -0.34    -1.33     6.86     7

MONTH 3
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    54.04    49.85    45.65     8.40    11.27   127.0?     0.54     1.69     3.52     0.00     1
T   13    58.80    50.40    42.00    16.80    13.89   192.97     0.27     1.75     2.77     1.12     2
W    4    66.28    49.64    32.99    33.29    10.47   109.61    -0.23     0.98    -0.85    -0.42     7

MONTH 4
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    46.05    43.47    40.89     5.16     6.93    48.03     0.27     0.46     0.27     0.00     1
T   12    48.52    44.4?    40.37     8.15     6.42    41.??     0.4?     1.79     2.43     2.24     2
W    5    52.29    43.16    34.03    18.26     7.34    53.94    -0.13    -0.47    -0.50    -0.71     7

MONTH 5
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    47.17    44.64    42.12     5.05     6.77    45.90    -0.03     0.1?    -0.??     0.00     1
T   13    49.0?    44.42    39.75     9.33     7.72    59.59     0.34    -0.0?    -1.24    -0.51     2
W    4    52.03    45.38    38.74    13.28     4.18    17.45    -0.58     0.95    -0.63     1.66     7

MONTH 6
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    48.55    45.89    43.23     5.32     7.14    50.94    -0.25    -0.49     0.49     0.00     1
T   13    50.29    44.75    39.21    11.07     9.16    83.86     0.35    -0.60    -0.29    -2.48     2
W    4    48.51    44.55    40.59     7.92     2.49     6.21    -0.0?    -1.45    -0.63    -2.92     7

MONTH 7
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    50.99    48.32    45.64     5.35     7.18    51.52     0.09     0.15    -0.28     0.00     1
T   12    53.21    49.41    45.60     7.61     5.99    35.88    -0.21    -0.43    -0.25     2.26     2
W    5    55.51    44.90    34.28    21.23     8.54    72.89    -0.29    -0.84    -1.22    -7.07     7

MONTH 8
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    44.88    41.35    37.82     7.06     9.48    89.79     0.16    -0.13    -0.73     0.00     1
T   13    48.19    41.19    34.19    14.00    11.58   134.07     0.10    -0.02    -1.24    -0.39     2
W    4    44.46    39.88    35.30     9.16     2.88     8.31    -0.11    -1.08    -1.33    -3.55     7

MONTH 9
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    50.53    47.58    44.64     5.89     7.91    62.54     0.11     0.53    -0.31     0.00     1
T   13    51.11    46.16    41.22     9.89     8.18    66.91     0.33    -0.08    -0.68    -2.99     2
W    4    58.63    46.63    34.62    24.01     7.55    57.0?    -0.73    -0.41    -1.16    -2.01     7

MONTH 10
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    47.75    45.30    42.86     4.??     6.55    42.92    -0.0?    -0.37    -0.26     0.00     1
T   13    50.02    46.13    42.25     7.76     6.42    41.21    -0.05     0.21    -1.44     1.83     2
W    4    57.12    44.93    32.74    24.38     7.67    58.76    -0.77     0.38    -1.36    -0.84     7

MONTH 11
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    50.00    46.0?    42.01     7.9?    10.?2   114.?3    -0.03     0.78    -0.03     0.00     1
T   12    53.05    46.82    40.60    12.45     9.81    96.15     0.26     1.09    -0.07     1.78     2
W    5    48.46    43.19    37.92    10.53     4.24    17.95    -0.2?    -1.62    -0.11    -6.12     7

MONTH 12
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30        ?    41.77    39.03     5.49     7.37    54.26    -0.17     0.39     0.2?     0.00     1
T   13    46.08    41.80    37.51     8.58     7.09    50.31    -0.27    -0.20    -0.70     0.05     2
W    4    61.06    41.46    21.86    39.19    12.32   151.90    -0.39     0.64    -0.97    -0.75     7

Computation time = 3.0 minutes.
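
The FR, N and DELTA columns appear to work as follows: D, T and W denote daily, thrice-weekly and weekly subsamples of one month of daily values, and DELTA matches the percent difference between a subsample mean and the daily mean (for example, 100 x (43.31 - 41.54) / 41.54 is about 4.26 in Month 1). The Python sketch below illustrates that reading on synthetic data. The sampling days, the 30-day month, the normal generator and all names are assumptions made for illustration; they are not taken from the authors' program.

```python
import random
import statistics

# Hypothetical sampling days within a 30-day month (day 0 = first day).
SCHEMES = {
    "D": list(range(30)),                               # daily, N = 30
    "T": [d for d in range(30) if d % 7 in (0, 2, 4)],  # thrice weekly, N = 13
    "W": [d for d in range(30) if d % 7 == 3],          # weekly, N = 4
}

def monthly_means(daily):
    """Return {scheme: (N, mean, DELTA)} for one month of daily values.

    DELTA is the percent difference of the subsample mean from the daily mean.
    """
    daily_mean = statistics.fmean(daily)
    rows = {}
    for name, days in SCHEMES.items():
        sample = [daily[d] for d in days]
        mean = statistics.fmean(sample)
        rows[name] = (len(sample), mean, 100.0 * (mean - daily_mean) / daily_mean)
    return rows

random.seed(1)
month = [random.gauss(45.0, 9.0) for _ in range(30)]    # synthetic low-variability month
for name, (n, mean, delta) in monthly_means(month).items():
    print(f"{name} {n:3d} {mean:8.2f} {delta:8.2f}")
```

Shifting the weekly sampling day changes N between 4 and 5 for a 30-day month, which is one possible explanation for the mix of W 4 and W 5 rows in the listing.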
ABSOLUTE FREQUENCY DISTRIBUTION
MEDIUM VARIABILITY INDUSTRY (DATA00VM.PRN)

[Bar chart not recoverable from the scan. Classes: 0-14, 15-29, 30-44, 45-59, 60-74, 75-89, 90-104, 105-119, 120-134, 135-154.]

RELATIVE FREQUENCY DISTRIBUTION
MEDIUM VARIABILITY INDUSTRY (DATA00VM.PRN)

(A ? marks a digit that is illegible in the scanned original.)

CLASS          RELATIVE         CUMULATIVE
               FREQUENCY (%)    FREQUENCY (%)
  0 -  14          1.10             1.10
 15 -  29          3.01             4.11
 30 -  44         10.96            15.07
 45 -  59         13.42            28.49
 60 -  74         20.00            48.49
 75 -  89         18.0?            66.58
 90 - 104         15.34            81.92
105 - 119         11.51            93.42
120 - 134          4.38            97.81
135 - 154          2.19           100.00

09-14-1988 14:12:04

DESCRIPTIVE
C:\DESIGN\DATA00VM.PRN

                                      STATISTIC
THE MEAN IS                              76.315
THE SD IS                                28.691
THE MINIMUM IS                            0.000
THE MAXIMUM IS                          154.000
RANGE IS                                154.000
THE VARIANCE OF X IS                    823.197
THE SKEWNESS COEFFICIENT IS              0.0069
THE KURTOSIS COEFFICIENT IS              2.5471
THE EXCESS COEFFICIENT IS               -0.4529
THE COEFFICIENT OF VARIATION IS            37.6
AUTOCORR. COEFFICIENT (LAG-1) IS         0.0755
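
The last three entries of this listing are derived quantities. The excess coefficient equals the kurtosis minus 3 (2.5471 - 3 = -0.4529) and the coefficient of variation is 100 x SD / mean (100 x 28.691 / 76.315, about 37.6). The Python sketch below reproduces both and adds a commonly used lag-1 autocorrelation estimator. The estimator actually used by the program is not given, so that part is an assumption, as are the function names. The class limits follow the CV ranges quoted in the summary (below 30% low, 30% to 60% medium, above 60% high); the treatment of values exactly on a limit is chosen arbitrarily.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation in percent."""
    return 100.0 * sd / mean

def variability_class(cv_percent):
    """Classify a CV (percent) using the ranges quoted in the summary."""
    if cv_percent < 30.0:
        return "low"
    if cv_percent <= 60.0:
        return "medium"
    return "high"

def lag1_autocorrelation(values):
    """Lag-1 autocorrelation: lagged cross-products over the sum of squares."""
    n = len(values)
    m = sum(values) / n
    num = sum((values[i] - m) * (values[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in values)
    return num / den

cv = coefficient_of_variation(76.315, 28.691)   # mean and SD from the listing
excess = 2.5471 - 3.0                           # kurtosis minus 3
print(f"CV = {cv:.1f}% ({variability_class(cv)}), excess = {excess:.4f}")
```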


* * * * CONFIDENCE INTERVAL ANALYSIS * * * *

GARTNER LEE LIMITED
SIMULATION RUN NUMBER VM
09-14-1988
14:27:45

STATISTICS CALCULATED USING POPULATION MEAN (Mu)

DATASET C:\DESIGN\DATA00VM.PRN
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40 
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47 
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54 
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125 

95 
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54 

35 

60 

122 

54 

106 
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130 
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122 

69 

70 

33 
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50 

41 

4 '5 

104 

104 

80 

98 

11^ 

64 

108 
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140 

114 

70 

85 

116 

82 

98 

72 

81 

118 
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88 

119 

88 

63 

109 

83 

66 

115 

82 

50 

40 

74 

99 
88 
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121 

5.: 

137 

137 

120 

68 

34 

104 

88 

94 

lo- 
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54 

74 
50 

107 
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63 
106 

45 
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52 

66 
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68 
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114 

52 

108 

115 

46 
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103 

41 

114 

92 

40 

118 

Q? 

52 

126 

71 

!2i 

4ft 

118 

V' 

83 

50 

96 

108 

27 

71 

8i 

31 

19 

CI 

"b 

78 

46 

116 

128 
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80 

48 
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57 

40 

6'5 

121 

137 
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74 

68 

40 

22 

67 

bi 
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SIMULATION  RUN  NUMBER  VM 

09-14-1988
14:28:23

(A ? marks a digit that is illegible in the scanned original.)

MONTH 1
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    75.75    65.20    54.65    21.09    28.32   801.89     0.15    -0.49    -0.37     0.00     1
T   13    80.50    64.38    48.27    32.22    26.65   709.98    -0.05    -0.33     0.01    -1.25     2
W    4    88.69    72.25    55.81    32.88    10.34   106.89    -0.67     1.33    -1.15    10.81     7

MONTH 2
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    89.27    76.30    63.33    25.93    34.81  1212.08     0.31     0.14    -0.25     0.00     1
T   13    98.93    81.85    64.77    34.16    28.25   797.97     0.47     0.43    -1.40     7.27     2
W    4   171.33    83.00     0.00   176.65    55.55  3085.89    -0.38     0.05    -1.15     8.7?     7

MONTH 3
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    88.69    80.40    72.11    16.58    22.26   495.37    -0.37    -0.30    -0.46     0.00     1
T   13    83.55    70.23    56.91    26.63    22.03   485.13    -0.0?    -1.5?     0.62   -12.65     2
W    4   141.33   101.50    61.67    79.66    25.05   627.46    -0.43     1.2?    -1.26    26.24     7

MONTH 4
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    77.46    67.47    57.4?    19.9?    26.83   720.12    -0.32    -0.02    -1.23     0.00     1
T   12    73.9?    60.33    46.68    27.31    21.50   462.27    -0.28    -0.57    -0.49   -10.57     2
W    5   125.90    89.20    52.50    73.40    29.52   871.30    -0.09     0.99    -1.32    32.21     7

MONTH 5
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    82.66    74.33    66.01    16.65    22.35   499.36    -0.57    -0.38    -1.09     0.00     1
T   13    86.68    72.77    58.85    27.83    23.02   529.70     0.11    -0.35    -1.18    -2.10     2
W    4   112.61    64.50    16.39    96.22    30.26   915.44    -0.06    -0.79    -1.54   -13.23     7

MONTH 6
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    72.54    63.63    54.73    17.81    23.90   571.37    -0.74     0.25    -1.36     0.00     1
T   13    78.57    65.23    51.89    26.68    22.06   486.73     0.24     0.37    -1.36     2.51     2
W    4   100.51    71.25    41.99    58.52    18.40   338.70    -0.01     0.?1    -1.27    11.97     7

MONTH 7
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    95.04    83.17    71.29    23.75    31.88  1016.27     0.20     0.09    -1.36     0.00     1
T   12    96.61    75.67    54.72    41.89    32.98  1087.81     0.41    -0.24    -1.70    -9.02     2
W    5   117.35    86.20    55.05    62.30    25.06   627.76    -0.42     0.6?    -1.00     3.65     7

MONTH 8
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    97.34    85.77    74.19    23.15    31.08   966.05     0.44    -0.55     0.15     0.00     1
T   13   105.56    85.00    64.44    41.13    34.01  1156.74     0.11    -1.01     0.70    -0.89     2
W    4   144.91    94.75    44.59   100.33    31.55   995.39    -0.05     1.20    -0.71    10.47     7

MONTH 9
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    90.54    81.00    71.46    19.09    25.63   656.67     0.28     0.56    -0.48     0.00     1
T   13   101.74    85.08    68.41    33.32    27.56   759.46     0.07     0.69    -0.61     5.03     2
W    4   107.58    81.00    54.42    53.16    16.72   279.50    -0.03    -0.33    -1.50     0.00     7

MONTH 10
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    84.03    76.87    69.70    14.33    19.23   369.85    -0.15     0.36    -0.57     0.00     1
T   13    79.63    69.92    60.21    19.42    16.06   257.82    -0.45    -1.1?    -0.64    -9.03     2
W    4   104.07    86.50    68.93    35.13    11.05   122.05    -0.34     1.29    -1.22    12.53     7

MONTH 11
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    96.35    86.53    76.7?    19.64    26.36   694.92    -0.47    -0.19    -1.23     0.00     1
T   12   110.49    96.83    83.17    27.32    21.51   462.56     0.10     0.52    -1.31    11.90     2
W    5   114.47    90.60    66.73    47.75    19.20   368.78    -0.69     0.61    -1.52     4.70     7

MONTH 12
FR   N      UCL     MEAN      LCL    RANGE       SD      VAR       AC     SKEW   EXCESS    DELTA  FREQ
D   30    92.06    79.13    66.20    25.86    34.71  1204.92     0.28     0.02    -1.16     0.00     1
T   13   104.76    83.92    63.09    41.67    34.46  1187.32    -0.61     0.36    -1.31     6.05     2
W    4   135.83    81.00    26.17   109.65    34.48  1188.98    -0.04    -0.3?    -1.14     2.36     7

Computation time = 3.4 minutes.
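
The UCL, LCL and RANGE columns are consistent with a two-sided 95% Student t interval around each subsample mean, with the lower limit set to zero when it would otherwise be negative and RANGE reported as the full, untruncated width. This is inferred from the numbers (for example Month 2, weekly: mean 83.00, SD 55.55, N = 4) and is not stated in the output. A minimal Python sketch under that assumption, with textbook t values hard-coded and an illustrative function name:

```python
import math

# Two-sided 95% Student t critical values, keyed by degrees of freedom (N - 1).
T_CRIT_95 = {3: 3.182, 4: 2.776, 11: 2.201, 12: 2.179, 29: 2.045}

def t_interval(mean, sd, n):
    """Return (UCL, LCL, full width) of a 95% t interval; LCL is floored at zero."""
    half = T_CRIT_95[n - 1] * sd / math.sqrt(n)
    return mean + half, max(0.0, mean - half), 2.0 * half

# Month 2, weekly sampling, medium-variability data set.
ucl, lcl, width = t_interval(83.00, 55.55, 4)
print(f"UCL = {ucl:.2f}, LCL = {lcl:.2f}, RANGE = {width:.2f}")
```

The sketch lands within about 0.1 of the printed 171.33, 0.00 and 176.65; the small difference suggests the program used a slightly different t value.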
ABSOLUTE FREQUENCY DISTRIBUTION
HIGH VARIABILITY INDUSTRY (DATA00VH.PRN)

CLASS          ABSOLUTE      CUMULATIVE
               FREQUENCY     FREQUENCY (%)
  0 -  19         118           32.33
 20 -  39          36           42.19
 40 -  59          35           51.78
 60 -  79          37           61.92
 80 -  99          22           67.95
100 - 119          30           76.16
120 - 139          26           83.29
140 - 159          21           89.04
160 - 179          16           93.42
180 - 202          24          100.00

[The counts for classes 0-19, 20-39, 100-119 and 160-179 and the final cumulative value are not legible in the scan; the values shown follow from the legible cumulative percentages and the 365 cases.]

RELATIVE FREQUENCY DISTRIBUTION
HIGH VARIABILITY INDUSTRY (DATA00VH.PRN)

[Bar chart not recoverable from the scan; its class limits match the table above.]
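
The cumulative percentages in this distribution are running sums of the class counts divided by the number of cases. The Python sketch below shows the calculation. The counts are the ones implied by the cumulative percentages for 365 cases (several bar labels are not legible in the scan), so they should be read as an inference rather than as printed values; the function name is illustrative.

```python
from itertools import accumulate

def cumulative_percent(counts):
    """Cumulative relative frequency (percent, two decimals) for class counts."""
    total = sum(counts)
    return [round(100.0 * c / total, 2) for c in accumulate(counts)]

# Class counts for 0-19, 20-39, ..., 180-202 (inferred, see the note above).
counts = [118, 36, 35, 37, 22, 30, 26, 21, 16, 24]
print(cumulative_percent(counts))
```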
DESCRIPTIVE

(A ? marks a digit that is illegible in the scanned original.)

THE MEAN IS                              68.0??
THE SD IS                                62.622
THE MINIMUM IS                            0.000
THE MAXIMUM IS                          201.933
RANGE IS                                201.933
THE VARIANCE OF X IS                   3921.552
THE SKEWNESS COEFFICIENT IS              0.5435
THE KURTOSIS COEFFICIENT IS              2.0736
THE EXCESS COEFFICIENT IS               -0.9264
THE COEFFICIENT OF VARIATION IS            92.0

GARTNER LEE LIMITED
SIMULATION RUN NUMBER VH
14:45:0?
STATISTICS CALCULATED
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DATASE! 


O   135 


lg2  54  J  104  l\}v  1  7  ]"':  i2  1  86 

12-!  1  25  i-i''^  or-  (i  '',  12-  7  'i~,  13" 

14J  -■'-  0  r,'^  ni  6  0  85  96  0  9 

78  ivi  115  r  24  IT  45  0  27  14/  153 

'•4  143  46  IIH  200  190  59  36  114  5-  0 

IGi  1-0  K-  ""  ;-.  Iv-  39  G  ■-!  -,7  [ 
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151 

las 

j   : 

48 

173 

T7 

152 

r, 

22 

pfi] 

iRG 

";o 

20 

185 

126 

i 

2 

62 

33 

0 

144 

2G2 

Oj 

■^i^ 

16H 

'":l 

0 

88 

116 

0 

137 

0 

n 

'23 

5" 

1 

ii 

23 

80 

200 

Ifcc 

6- 

153 
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0 

25 

134 

85 

56 

18 

85 

172 

131 

0 

2" 

129 

42 

0 

131 

110 

o4 

V': 

5u 

G 

'2 

152 

' 

! 

23 

59 

:'! 

3? 

34 

•100 

60 

59 

45 

2 

T'O 

2 

111 

106 

2 

124 

1 75 

6 

72 

183 

173 

2 

0 

2^0 

165 

2 

n 

1 

119 

132 

60 

68 

147 

4 

0 

84 

200 

195 

-' 

0 

5i 

140 

5.: 

7 

146 

119 

2 

] 

0 

118 

42 

70 

07 

111 

;r 

44   127 
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*  *  *  *CONFIDENCE  INTERVAL  ANALYSIS  t  *  *  * 

GARTNER  LEE  LIHITED 
SIHULATION  RUN  NUMBER  VH 
11-21-1988 
21:29:16 
STATISTICS  CALCULATED  USING  POPULATION  HEAN  (Hu) 

DATASET  C:\DESIGN\DATAOOVH.PRN 


HONTH 
FR   N 


UCL 


HEAN 


LCL 


RANGE 


SD 


VAR 


SKEW 


EXCESS   DELTA   FREO 


D  30 

62.55 

41.72 

20.88 

41.68 

55.95 

3130.34 

0.55 

1.58 

1.85 

0.00 

1 

T   13 

69.33 

41.83 

14.33 

55.00 

45.48 

2068.76 

0.08 

0.68 

-0.84 

0.27 

2 

H   4 

201.7? 

70.76 

0.00 

262.02 

82.40 

6789.20 

-0.14 

1.74 

0.41 

69.62 

7 

MONTH 

2 

FR   N 

UCL 

HEAN 

LCL 

RANGE 

SD 

VAR 

AC 

SKEW 

EXCESS 

DELTA 

FREQ 

D  30 

52.1- 

35.  Oc 

17.94 

34.25 

45.98 

2113.95 

0.11 

1.4? 

2.14 

COG 

1 

T   13 

59.90 

36.91 

13.93 

45.9? 

38.02 

1445.44 

-0.11 

0.62 

-1.28 

5.28 

2 

U        i 

^i.^" 

6.38 

0.00 

95.97 

30.18 

910.74 

-0.13 

-1.10 

-1.77 

-81.80 

7 

HON  IN 

; 

FR   N 

UCL 

HEAN 

LCL 

RANGE 

SD 

VAR 

AC 

sr.Ew 

EXCESS 

DELTA 

FREQ 

D  30 

69.95 

50.0^ 

30.25 

07.71 

53.31 

2842.2^ 

0.35 

1.1b 

0.74 

0.00 

1 

T   13 

OT  -n 

c.  ,;! 

25.51 

61.79 

51.10 

2611.29 

0.23 

1.00 

-0.02 

12.61 

2 

|i     .3 

0.00 

115.93 

36. 4- 

1329.08 

-0.05 

-1.27 

-1.25 

-49.32 

7 

MONTH 

: 

FR   N 

■A.. 

HEAN 

LCL 

RANGE 

SD 

VAR 

AC 

SKEK 

EXCESS 

DELTA 

FREQ 

D   30 

81.53 

60.61 

39.68 

41.85 

56.18 

3156.34 

0.19 

0.78 

-0.26 

0.00 

1 

T   12 

94. '55 

57. ol 

20.27 

74.68 

58.80 

3456.94 

-0.35 

0.99 

0.38 

-4.95 

2 

H   5 

144.10 

^C.Of: 

0.00 

168.03 

67.58 

4566.65 

0.18 

0.48 

-1.61 

-0.87 

1 
t 

HONTH 

5 

FR   N 

UCL 

MEAN 

LCL 

RANGE 

SD 

VAR 

AC 

SKEH 

EXCESS 

DELTA 

FREO 

D   30 

123.15 

96.50 

69.84 

53.3! 

71.56 

5121.15 

-0.03 

-0.06 

-1.58 

0.00 

1 

T   13 

142.31 

103.55 

64.78 

:'7.53 

64.12 

4111.12 

-0.08 

-0.07 

-1.53 

7.30 

2 

H   4 

298.70 

172.16 

45.62 

253.08 

79.58 

6333.75 

-0.19 

1.12 

-1.71 

78.41 

7 

MONTH 

6 

FR   N 

UCL 

HEAN 

LCL 

RANGE 

SD 

VAR 

AC 

SKEW 

EXCESS 

DELTA 

FREQ 

D  30  112.37  86.24  60.12  52.26 
T  13  112.03  73.82  35.61  76.42 
U   4    253.52   121.68    0.00   263.67 


70.15 

4921.4'* 

-0.02 

63.20 

3993.95 

-0.18 

82.92 

6875.06 

-0.29 

0.15  -1.50  0.00  1 
-0.17  -1.48  -14.40  2 
0.59    -1.62    41.09     7 


-  75 


MONTH 
FR   N 


7 
UCL 


MFAN 


LCL 


RANGE 


VAR 


SKEW 


EXCESS   DELTA   FREQ 


D  30 

94. '57 

70.41 

45.86 

49.11 

65.92 

4345.73 

0.03 

0.52 

-1.07 

0.00 

1 

T   12 

116.83 

70.24 

23.66 

93.17 

73.35 

5380.32 

-0.41 

0.41 

-1.52 

-0.24 

2 

W   5 

142.40 

54.65 

0.00 

175.49 

70.58 

4981.06 

0.11 

-0.01 

-1.70 

-22.39 

7 

MONTH  8

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30    84.90    63.86    42.83    42.07    56.48   3189.56   -0.04    0.23   -1.28     0.00     1
T   13    86.83    47.35     7.86    78.97    65.31   4265.30   -0.32    0.07   -1.50   -25.86     2
W    4   151.27    94.26    37.25   114.02    35.85   1285.53   -0.10    1.31   -1.20    47.60     7

MONTH  9

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30   108.77    85.87    62.98    45.79    61.48   3779.31    0.15    0.24   -0.4?     0.00     1
T   13   123.73    89.40    55.06    68.67    56.78   3224.45    0.11    0.42   -0.62     4.10     2
W    4   119.25    45.57     0.00   147.37    46.34   2147.58   -0.01   -1.2?   -1.28   -46.93     7

MONTH  10

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30    92.23    70.37    48.51    43.72    58.70   3445.44    0.04    0.28   -1.24     0.00     1
T   13   102.37    70.54    38.71    63.66    52.65   2771.68   -0.56    0.03   -1.4?     0.24     2
W    4   154.37    78.38     0.00   161.97    50.93   2594.23   -0.03    0.00   -1.25    11.38     7

MONTH  11

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30   103.15    78.93    54.82    48.32    64.87   4208.58    0.12    0.23   -1.30     0.00     1
T   12   134.1?    ??.??    48.23    85.84    67.62   4572.63   -0.38    0.1?   -1.5?    15.43     2
W    5   135.62    73.71    11.81   123.81    49.79   2479.24   -0.2?    1.10   -0.44    -6.67     7

MONTH  12

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30    97.61    76.27    54.92    42.69    57.31   3284.45    0.05    0.31   -0.9?     0.00     1
T   13   104.40    72.45    40.51    63.90    52.84   2792.06   -0.49    0.62    0.13    -5.00     2
W    4   225.61   128.83    32.05   193.55    60.87   3704.67   -0.02    1.27   -1.32    68.93     7

Computation time : ?.9 minutes.
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CLASS 

  0 -  14      336
 15 -  29       24
 30 -  44        1
 45 -  59        0
 60 -  74        0
 75 -  89        1
 90 - 104        0
105 - 119        0
120 - 134        0
135 - 149        3

ABSOLUTE FREQUENCY DISTRIBUTION

VERY HIGH VARIABILITY INDUSTRY (DATA00VV.PRN)

CLASS        FREQUENCY (%)   CUMULATIVE (%)
  0 -  14            92.05            92.05
 15 -  29             6.58            98.63
 30 -  44             0.27            98.90
 45 -  59             0.00            98.90
 60 -  74             0.00            98.90
 75 -  89             0.27            99.18
 90 - 104             0.00            99.18
105 - 119             0.00            99.18
120 - 134             0.00            99.18
135 - 149             0.82           100.00

RELATIVE FREQUENCY DISTRIBUTION

VERY HIGH VARIABILITY INDUSTRY (DATA00VV.PRN)
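
The relative and cumulative percentages can be checked against the class counts of the absolute frequency distribution. The following is a minimal Python sketch, assuming the ten class counts read 336, 24, 1, 0, 0, 1, 0, 0, 0 and 3 (365 values in total); the function name `relative_and_cumulative` is hypothetical and is not part of the original program package.

```python
from itertools import accumulate

# Class counts as read from the absolute frequency distribution
# (classes 0-14, 15-29, ..., 135-149). These readings are an assumption.
COUNTS = [336, 24, 1, 0, 0, 1, 0, 0, 0, 3]


def relative_and_cumulative(counts):
    """Return (relative %, cumulative %) lists for a list of class counts."""
    total = sum(counts)
    relative = [100.0 * c / total for c in counts]
    # Accumulate the integer counts first so the last cumulative value is exactly 100.
    cumulative = [100.0 * s / total for s in accumulate(counts)]
    return relative, cumulative


if __name__ == "__main__":
    rel, cum = relative_and_cumulative(COUNTS)
    for lower, r, c in zip(range(0, 150, 15), rel, cum):
        print(f"{lower:3d} - {lower + 14:3d}  {r:6.2f}  {c:6.2f}")
```

Under these assumed counts the second class works out to 24/365, or about 6.58%, and the last class to 3/365, or about 0.82%, which agrees with the legible entries of the relative frequency printout.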
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J    I 


THE MEAN IS                               4.1?
THE SD IS                                 ?
THE MINIMUM IS                            0.000
THE MAXIMUM IS                          149.000
RANGE IS                                149.000
THE VARIANCE OF X IS                    191.863
THE SKEWNESS COEFFICIENT IS               8.4?
THE KURTOSIS COEFFICIENT IS              82.0592
THE EXCESS COEFFICIENT IS                79.0592
THE COEFFICIENT OF VARIATION IS         334.6
AUTOCORR. COEFFICIENT (LAG=1) IS          ?

* * * * CONFIDENCE INTERVAL ANALYSIS * * * *

GARTNER LEE LIMITED
SIMULATION RUN NUMBER VV
11-21-1988
21:36:03
STATISTICS CALCULATED USING POPULATION MEAN (Mu)

DATASET C:\DESIGN\DATA00VV.PRN


0 

1 

0 

0 

n 

0 
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0 

0 

2 

1 

0 
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1 

0 

0 

1 

1 

2 

0 

1 

0 

0 

0 

0 

1 

1 

1 

0 

1 

0 

1 

0 

0 

0 

0 

0 

1 

16 

18 

16 

1 

2 

19 

1 

0 

0 

0 

0 

1 

0 

0 

1 

0 

0 

0 

1 

3 

2 

16 

3 

0 

0 

1 

2 

3 

1 

0 

0 

3 

0 

1 

3 

0 

2 

0 

0 

7 

2 

2 

1 

3 

18 

1 

1 

2 

3 

0 

0 

2 

1 

0 

0 

0 

1 

33 

137 

14.3 

138 

77 

! 

n 

2 

3 

2 

1 

2 

0 

0 

0 

0 

0 

0 

4 

IS 

2 

1 

17 

i. 

2 

L 

15 

1 

0 

1 

0 

2 

2 

3 

5 

0 

2 

2 

1 

21 

4 

c 

1'^ 

i 

16 

1 

p 

13 

; 

! 

;? 

! 

16 

3 

0 

1 

0 

2 

0 

0 

3 

0 

1 

0 

2 

3 

2 

1 

3 

1 

4 

0 

! 

2 

1 

0 

3 

1 

1 

14 

2 

3 

1 

5 

! 

? 

1 

0 

4 

0 

1 

0 

0 

2 

3 

0 

4 

1 

0 

2 

1 

11 

1 

16 

4 

2 

4 

0 

4 

15 

1 

3 

3 

1 

^ 

1 

G 

1 

0 

3 

0 

0 

0 

1 

0 

2 

0 

1 

3 

0 

3 

2 

0 

4 

i. 

5 

1 

2 

5 

n 

2 

15 

1 

5 

2 

0 

2 

2 

2 
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3 
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4 

0 
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*  ♦  *  ^CONFIDENCE  INTERVAL  ANALYSIS  *  i  *  * 

GARTNER LEE LIMITED
SIMULATION RUN NUMBER VV
11-21-1988
21:32:33
STATISTICS CALCULATED USING POPULATION MEAN (Mu)

DATASET C:\DESIGN\DATA00VV.PRN


MONTH  1

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     0.85     0.60     0.35     0.49     0.66      0.44    0.09    0.66   -0.63     0.00     1
T   13     1.20     0.77     0.34     0.87     0.72      0.51   -0.11    1.00   -0.53    28.21     2
W    4     1.31     0.50     0.00     1.62     0.51      0.26    0.00   -0.57   -1.85   -16.67     7

MONTH  2

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     5.03     2.90     0.77     4.25     5.71     32.62    0.44    2.10    2.61     0.00     1
T   13     8.70     4.38     0.06     8.64     7.15     51.06    0.55    1.79    0.73    51.19     2
W    4     4.22     1.00     0.00     6.45     2.03      4.11    0.00   -1.1?   -1.56   -65.52     7

MONTH  3

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     2.6?     1.60     0.53     2.15     2.88      8.31    0.13    4.11   17.82     0.00     1
T   13     1.76     1.08     0.40     1.36     1.13      1.27   -0.28   -0.65   -1.34   -32.69     2
W    4    16.58     5.00     0.00    23.16     7.28     53.06   -0.43    1.?3    0.82   212.50     7

MONTH  4

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30    35.40    19.27     3.14    32.26    43.31   1875.80    0.80    2.2?    3.52     0.00     1
T   12    51.5?    23.75     0.00    55.65    43.81   1919.45    0.43    2.31    3.68    23.27     2
W    5    23.78     0.80     0.00    45.96    18.48    341.58   -0.02   -1.00   -1.??   -95.85     7

MONTH  5

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     4.66     2.90     1.1?     3.53     4.74     22.42    0.05    2.44    4.53     0.00     1
T   13     4.77     2.46     0.15     4.62     3.82     14.59   -0.08    2.30    4.85   -15.12     2
W    4    16.74     5.25     0.00    22.98     7.23     52.21   -0.37    1.84    0.63    81.03     7

MONTH  6

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     7.76     5.33     2.91     4.84     6.50     42.2?   -0.24    1.26   -0.0?     0.00     1
T   13     8.87     5.15     1.43     7.44     6.15     37.85    0.58    1.21   -0.12    -3.37     2
W    4    24.87     9.25     0.00    31.24     9.82     96.53   -0.86    1.??   -1.1?    73.44     7



MONTH  7

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     2.77     1.80     0.83     1.95     2.61      6.83   -0.06    3.38   13.01     0.00     1
T   12     4.96     2.58     0.21     4.76     3.74     14.02   -0.29    2.8?    6.41    43.52     2
W    5     3.94     1.80     0.00     4.28     1.72      2.96   -0.01    1.02   -0.35     0.00     7

MONTH  8

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     4.11     2.83     1.56     2.55     3.43     11.74   -0.11    2.23    5.52     0.00     1
T   13     6.74     3.92     1.11     5.64     4.66     21.72    0.49    2.14    2.67    38.46     2
W    4     5.01     1.00     0.00     8.02     2.52      6.36   -0.23   -1.04   -1.?9   -64.71     7

MONTH  9

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     2.70     1.67     0.63     2.07     2.78      7.76    0.08    3.63   14.60     0.00     1
T   13     1.64     0.85     0.05     1.59     1.31      1.73    0.38   -0.83   -1.62   -49.23     2
W    4     4.11     1.75     0.00     4.71     1.48      2.19   -0.03    0.60   -1.05     5.00     7

MONTH  10

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     3.62     2.60     1.58     2.05     2.75      7.57   -0.21    2.9?   10.95     0.00     1
T   13     3.0?     2.21     1.37     1.71     1.42      2.01   -0.11    0.01   -0.59   -14.20     2
W    4     5.78     2.50     0.00     6.56     2.06      4.26   -0.94   -0.15   -1.7?    -3.85     7

MONTH  11

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     7.03     4.93     2.83     4.20     5.64     31.80    0.17    1.25    0.??     0.00     1
T   12     8.76     5.17     1.57     7.19     5.66     32.03   -0.09    1.11   -0.37     4.73     2
W    5    13.17     5.00     0.00    16.34     6.57     43.20   -0.3?    1.45    0.20     1.35     7

MONTH  12

FR   N      UCL     MEAN      LCL    RANGE       SD       VAR      AC    SKEW  EXCESS    DELTA  FREQ
D   30     5.32     3.83     2.35     2.97     3.99     15.94   -0.11    2.08    3.44     0.00     1
T   13     5.74     3.69     1.65     4.09     3.39     11.46    0.22    1.85    3.53    -3.68     2
W    4    15.56     5.50     0.00    20.12     6.33     40.03   -0.41    1.75    0.43    43.48     7

Computation time : 3.0 minutes.
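
The column headings of the confidence interval printouts are not defined in this part of the report. As a reading aid, the following minimal Python sketch reproduces one printed row under assumptions inferred from the printed numbers themselves, not from the program listing: UCL and LCL are treated as a two-sided 95% Student's t interval about the sample mean, with the lower limit truncated at zero; RANGE is the full, untruncated interval width; VAR is the square of SD; and DELTA is the percentage difference between the subsample mean and the daily (D) mean for the same month. The names `interval_row` and `T_CRIT_95` are hypothetical.

```python
import math

# Two-sided 95% Student's t critical values, keyed by degrees of freedom (n - 1).
# Only the sample sizes that occur in the printouts (N = 4, 5, 12, 13, 30) are listed.
T_CRIT_95 = {3: 3.182, 4: 2.776, 11: 2.201, 12: 2.179, 29: 2.045}


def interval_row(mean, sd, n, daily_mean):
    """Recompute the derived columns of one printed row (assumed definitions)."""
    half_width = T_CRIT_95[n - 1] * sd / math.sqrt(n)
    return {
        "UCL": mean + half_width,
        "LCL": max(0.0, mean - half_width),  # assumed: concentrations cannot be negative
        "RANGE": 2.0 * half_width,           # assumed: width before truncation
        "VAR": sd ** 2,
        "DELTA": 100.0 * (mean - daily_mean) / daily_mean,
    }


if __name__ == "__main__":
    # Weekly (N = 4) row for month 12 of DATA00VV.PRN, using the printed MEAN and SD
    # and the printed daily mean of 3.83.
    row = interval_row(mean=5.50, sd=6.33, n=4, daily_mean=3.83)
    for name, value in row.items():
        print(f"{name:>6} {value:8.2f}")
```

For this row the sketch gives values close to the printed ones (UCL 15.56, LCL 0.00, RANGE 20.12, VAR 40.03, DELTA 43.48). Small differences are to be expected, because the inputs are rounded to two decimals and the critical value used by the original program is not stated.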
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